TRACKING IN UNCERTAIN ENVIRONMENTS*


by 


D.  J.  Salmond 


SUMMARY 


This study concerns the problem of tracking a target when the origin of the
sensor measurements is uncertain. The full Bayesian solution to this type of
problem gives rise to Gaussian mixture distributions, which are composed of an
ever increasing number of components. To implement such a tracking filter, this
growth of components must be controlled by approximating the mixture distribution.

Two algorithms have been developed for approximating Gaussian mixture
distributions. These techniques attempt to minimize the number of mixture
components without modifying the 'structure' of the distribution beyond a
specified limit. Also, the final approximation is itself a Gaussian mixture.

The performance of the algorithms has been assessed by simulation for the
problem of tracking a single target in the presence of uniformly distributed false
measurements. This assessment indicates the significant range of problem
parameters where the new algorithms give a substantial performance improvement
over the well known Probabilistic Data Association Filter (which approximates the
mixture by a single Gaussian component).

The tracking example is extended in the second part of this study to show how
the Bayesian approach may be applied to more complex uncertain tracking problems,
including that of fusing data from several independent sources. In particular, a
computationally efficient filter is derived which improves the track estimate from
a primary sensor by making sub-optimal use of measurements from an auxiliary
sensor. Finally, a general solution is derived for a tracking problem with
multiple measurement classes. This general solution is used to derive a filter for
tracking a target in the presence of intermittent interfering measurements, in
addition to uniformly distributed false measurements.
1 INTRODUCTION


1.1 Background


A tracking filter is an algorithm for estimating the state
(such as position and velocity) of an object from measurements
of a sensor such as a radar. Following the usual convention, an
object being tracked will be called a target. A basic assumption
of most tracking filters, such as the α-β filter and other filters
derived from Kalman filter theory, is that only measurements from
the target of interest are passed to the filter. However, in
practice, sensors produce measurements as a result of random noise,
clutter, interference and other targets, in addition to those from the
required target. Usually it is not possible to distinguish with
certainty between the wanted and the unwanted measurements. Hence
there is a need for tracking filters which recognize that some of the
received measurements may not originate from the required target.


Measurement origin uncertainty is most commonly encountered in
the context of multiple target tracking, although in this study we shall
only be concerned with the single target case. A number of approaches
to the uncertain tracking problem, with the emphasis on multiple target
tracking, are reviewed in the recent books by Blackman and by Bar-Shalom
and Fortmann (Ref 2), and in the survey papers (Refs 3-5). There are
essentially two types of approach to estimation in the presence of
uncertainty: the decision-directed approach, where decisions are taken and
assumed to be true, and Bayesian techniques, which allow for the possibility
that the most likely option may be incorrect.


The simplest decision-directed technique is the 'nearest neighbour'
filter: the track is updated with the measurement which is in some sense
closest to the expected target position. This is likely to give a poor
result if several measurements occur in the vicinity of the expected
target position. In these circumstances a branching or track splitting
filter offers an improvement: a separate branch is propagated for each
possible measurement. The growth of tracks is controlled by merging
similar branches or by deleting branches if the likelihood function
(or the support) of that branch falls below a certain threshold (see
Smith and Buechler). A more sophisticated approach is to choose the
most likely hypothesis from the set of feasible hypotheses on the
association of all measurements that have been received. This is a
batch processing task (see Morefield) which should provide an optimal
solution in the maximum likelihood sense. Sequential versions of this
method have also been derived. These are computationally convenient
but sub-optimal (see Sittler (Ref 8), and Stein and Blackman (Ref 9)).

For this present study, the Bayesian approach has been adopted.
As already indicated, this approach avoids the need to make 'hard'
decisions among quite probable hypotheses. Also, an obvious implementation
is via a recursive filter, which is convenient for real time processing.
However, the full Bayesian solution is impractical and some approximation
is essential; promising results have been obtained by a number of
authors (Refs 10-26). Approximation of the optimal solution is one of the
main subjects of this study.

An approximate Bayesian filter for the problem of tracking a
single target in clutter was first formulated by Singer et al.
For the same problem, a very efficient approximation technique
known as the Probabilistic Data Association Filter (PDAF) was
proposed by Bar-Shalom and Tse (Ref 11). Various extensions of the
basic PDAF for special cases including target manoeuvres (Ref 12), random
measurement arrival times and dual sensors have been developed by
Bar-Shalom and co-workers. An extension of the PDAF to the multiple target
case was reported by Bar-Shalom and by Fortmann et al (also see Refs 20
to 22). An important paper by Reid (Ref 23) presents a Bayesian multiple
target filter which does not use the PDAF approximation. The branching
algorithm of Smith and Buechler may be viewed as a much simplified
version of this filter. More recent work on Bayesian multiple target
tracking is reported by Mori et al (Ref 26).

1.2 The Bayesian approach

In the Bayesian approach to tracking, one attempts to construct
the probability density function (pdf) of the target state x, based
on all available information including the set Z of received
measurements. The required conditional pdf of x may be written p(x|Z).
Since this pdf embodies all available statistical information, it may be
said to be the complete solution of the tracking problem. In principle,
an optimal estimate of x for any criterion may be obtained from
p(x|Z). A measure of the accuracy of the estimate may also be derived
from p(x|Z). Clearly it is most desirable to obtain this conditional
pdf whenever an estimate of the target state is required.

For many tracking problems an estimate is required every time
that a set of sensor measurements is received. In this case a recursive
filter is a convenient solution. Such a filter consists of essentially
two stages: prediction and update. For prediction it is assumed that
an equation describing the evolution of the target state is available.
This can be used to predict the pdf of state forwards from one
measurement time to the next. Since the target is usually subject to unknown
disturbances, prediction usually increases the covariance of the state
pdf. The update operation uses the latest set of measurements to
modify the predicted pdf. This is conveniently achieved using Bayes
theorem, which is the mechanism for updating a pdf or probability in the
light of extra information from new data.


For estimation problems where the origin of measurements is
known, Bayes theorem leads to the Kalman filter update relations (Refs 28-37)
provided that the problem is linear and all random elements are Gaussian
(see Appendix A). This is the optimal tracking filter for this standard
tracking problem, and in this case p(x|Z) is a Gaussian pdf. This is
not so if the measurement origin is uncertain. To construct the
required pdf in this case it is necessary to take account of all
possible measurement associations. For each of these possibilities
or hypotheses, there is a corresponding Gaussian pdf of target state.
Thus the overall pdf of target state is a Gaussian mixture pdf (Refs 33, 34)
of the form:


$$ p(x|Z) = \sum_{i=1}^{N} \beta_i \, p_i(x) \qquad (1.1) $$

where $p_i(x)$ is the Gaussian pdf corresponding to hypothesis i,
N is the total number of feasible hypotheses at this time and $\beta_i$ is
the probability that hypothesis i is correct, such that:

$$ \beta_i > 0 \quad \text{and} \quad \sum_{i=1}^{N} \beta_i = 1 . $$
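The mixture of equation (1.1) can be evaluated directly once the weights and the component pdfs are known. The following is a minimal sketch for a scalar state only; the function names and the three-component example values are hypothetical and are chosen purely for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a scalar Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_pdf(x, weights, means, variances):
    """Evaluate p(x|Z) = sum_i beta_i * p_i(x) for a scalar state x."""
    # The weights are hypothesis probabilities: positive and summing to one.
    if any(b <= 0.0 for b in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be positive and sum to one")
    return sum(b * gaussian_pdf(x, m, v)
               for b, m, v in zip(weights, means, variances))


# Hypothetical three-hypothesis example (illustrative values only).
weights = [0.6, 0.3, 0.1]
means = [0.0, 4.0, -3.0]
variances = [1.0, 0.5, 2.0]
density_at_origin = mixture_pdf(0.0, weights, means, variances)
```

Each extra hypothesis adds one term to the sum, so the cost of evaluating the pdf grows linearly with N.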


When new measurements are received for the update of this pdf, the
number of feasible hypotheses from past measurements is compounded by
origin uncertainty in the latest set. Since the probability and the
pdf corresponding to each of these hypotheses have to be updated via
Bayes theorem, it is clear that the computational requirements of the
full Bayesian solution increase rapidly as tracking proceeds. This is
the major difficulty of the Bayesian approach to the uncertain tracking
problem.

1.3 A practical sub-optimal filter

To implement a Bayesian filter it is essential to contain these
computational requirements within acceptable bounds by making
approximations. Any approximation which changes p(x|Z) renders the
filter sub-optimal, so the aim should be to achieve the necessary
reduction in computation with minimal performance penalty. At each
measurement time, it is usual practice to subject the received
measurements to a coarse acceptance test. This rejects any data that are very
unlikely to originate from the target, so that very improbable hypotheses
are not considered. However, origin uncertainty amongst the accepted
measurements may still cause the number of mixture components of
equation (1.1) to grow rapidly. Thus further direct approximation of
the mixture distribution may be necessary. Unlike the acceptance test,
this approximation may result in a significant modification of the
complete solution, and so the choice of approximation should be
carefully considered.

As already mentioned, the PDAF (Ref 11) is a popular and economical scheme
for approximating the mixture. This method reduces the complete mixture
to a single Gaussian component after processing each set of sensor
measurements. However, this may destroy valuable information, especially if
several significant well spaced components are present. To provide a
better approximation to the mixture, two new algorithms (the Clustering
Algorithm and the Joining Algorithm) have been derived which allow
more than one component to be retained. These mixture reduction
algorithms operate by merging similar components together, and they
are based on the requirement that reduction should proceed with
minimal modification to the 'structure' of the distribution (see
Chapter 3).
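To illustrate what may be lost when a mixture is reduced to a single Gaussian component, the following sketch collapses a scalar mixture to the Gaussian with the same mean and variance. Moment matching is assumed here as the collapsing rule, and the function name and example values are hypothetical; it is not a statement of the PDAF equations themselves.

```python
def collapse_mixture(weights, means, variances):
    """Return the mean and variance of a scalar Gaussian mixture.

    A single Gaussian with these moments is the moment-matched
    approximation to the whole mixture (an assumed collapsing rule).
    """
    mean = sum(b * m for b, m in zip(weights, means))
    # Total variance = within-component variance + spread of the means.
    var = sum(b * (v + (m - mean) ** 2)
              for b, m, v in zip(weights, means, variances))
    return mean, var


# Two equally likely, well spaced components (illustrative values only).
single_mean, single_var = collapse_mixture([0.5, 0.5], [-2.0, 2.0], [1.0, 1.0])
```

For this example the single Gaussian is centred midway between the two components, where the mixture itself has little probability mass, and its variance is five times that of either component. This is the kind of information loss that retaining more than one component is intended to avoid.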

1.4 The baseline problem and simulation studies

A baseline problem has been chosen for this study to provide a
specific example of the growth in the number of measurement association
hypotheses, and to show how the reduction algorithms may be applied to
control this. The problem is to track a single target from sensor data
which includes spurious as well as useful measurements. A set of sensor
measurements is produced at discrete time intervals. Each set is
composed of at most one true measurement which originates from the target,
and a number of false measurements which are uniformly distributed over
the measurement space and are independent of the target. The true
measurement has a Gaussian distribution about the target position and
it cannot be distinguished from the false measurements. The target
moves according to a linear model driven by Gaussian noise. The
full Bayesian solution to this problem, which has been considered
by several authors, is derived in Chapter 2. All of the
tracking examples considered in this study are variations of this
single target problem. The single target case suffices to investigate
the trade off between complexity and filter performance. Also, the
techniques developed here could be adapted for the multiple target
problem.

For our purposes, the performance of the mixture reduction
algorithms depends on the performance of the tracking filters that
employ them. The primary measure of filter performance chosen for
this study is the average time for which the filter maintains track
on a target, ie the average track lifetime. Since there is no tractable
analytical means of evaluating this performance measure, Monte Carlo
simulations have been carried out for a particular example of the
baseline problem. In this example, which is also used by Bar-Shalom
and Birmiwal, the target moves in a plane, the target kinematics are
described by a second order model and sensor measurements consist of
Cartesian co-ordinate pairs. The 'difficulty' of this tracking
problem may be easily controlled by adjusting several problem
parameters. In Chapter 4, the performance of tracking filters using
the new reduction algorithms is examined in detail for a single set of
problem parameters. In particular, the effect of varying the maximum
number of mixture components retained by the reduction algorithms is
investigated. For this and other simulations in this study, the PDAF
provides the performance reference against which other filters are
compared. The results of Chapter 4 indicate that the Clustering Algorithm
is more computationally efficient than the Joining Algorithm, and so
from Chapter 5 onwards the former reduction technique is employed.

In Chapter 5, the performance of a filter using the Clustering
Algorithm is compared with the PDAF over a wide range of problem
parameters for the simulation example. The new filter should always
outperform the PDAF, since the Clustering Algorithm retains more
information. We have attempted to identify the approximate region of the
problem parameter space where the performance of the Clustering
Algorithm filter is significantly better than the PDAF, ie where it is
worth retaining more than one component.

A second 'sector scan' example of the baseline problem is considered
in Chapter 6. This example has been used to examine the effect of several
practical filtering difficulties. These include sensor measurements in
polar co-ordinates which are a non-linear function of the target state,
and target manoeuvres which are not correctly represented by the filter's
assumed target model.
As well as providing a tool for evaluating a performance measure,
simulation is a useful aid to understanding the operation of a filter.
For this study the simulation programs have been designed so that
either multiple replications can be performed to generate performance
statistics, or single runs can be carried out to examine filter
operation in detail. For multiple runs, overall performance
measures are produced together with a summary of the results of
each individual replication, including its random number seeds.
Thus any replication may be rerun with the program in single
replication mode to produce detailed output files for a thorough
analysis of filter operation. All simulation programs were
written in Fortran 77, and use of the Cray 1S computer at RAE
Farnborough enabled an extensive range of simulation experiments
to be performed.

1.5 Extensions of the baseline problem

The final part of this study is concerned with extensions of the
baseline problem. In Chapter 7 we consider the problem of fusing
information from a number of sources. For many sensors it is possible
to obtain information on the origin of a measurement by analysing the
signal from which it is derived. For example, the shape of the return
from a pulse radar or the fluctuation over several returns may indicate
whether the measurement originates from clutter or from a true target.
Clearly the filter should make use of this signature information, and
Nagarajan et al (Ref 35) show how it may easily be included in the Bayesian
formulation to modify p(x|Z). Also, in many tracking systems,
measurements are available from several independent sensors. Data from each of
these sensors may be incorporated sequentially because they are independent.
In Section 7.4 we consider the particular data fusion problem of combining
information from a primary sensor which produces range and bearing
position measurements with an independent auxiliary sensor. The
auxiliary sensor gives only bearing information but it does include
an imperfect classification of each of its measurements. A new filter
has been derived for this problem which uses the auxiliary measurements
in a sub-optimal but efficient way. The performance of the filter is
compared with the single sensor filter to show the value of sub-optimal
processing of the auxiliary measurements.

For Chapters 2 to 7 it is assumed that a measurement from a given
sensor is a sample from one of two distributions: true or false. In
Chapter 8 we extend this to allow for samples from more than two
distributions, ie more than two classes of measurement are allowed.
The general solution to this problem is derived. This general solution
is used to develop a practical filter for tracking a target in the
presence of intermittent interference, in addition to the usual false
measurements.
2 THE BASELINE PROBLEM: TRACKING A SINGLE TARGET IN THE PRESENCE
OF RANDOM UNIFORMLY DISTRIBUTED FALSE MEASUREMENTS

2.1 Introduction

In this chapter a formal statement of the baseline problem is
given and the optimal Bayesian solution of this problem is derived.
This problem, which is taken from Refs 10 and 11, provides a convenient
example which illustrates many of the difficulties of uncertain tracking,
and it is a suitable basis for extension to more complex problems.

A full account of the solution is presented here to facilitate
the description of extensions given in later chapters. The major result
is that the posterior pdf of target state at each time step is a
Gaussian mixture and that the number of components which comprise this
mixture increases with time. This is confirmed by induction. Assuming
that the prior pdf of the target state at time step k is a Gaussian
mixture, the posterior distribution, after updating with the
measurements received at this time step, is shown to be another Gaussian
mixture with an increased number of components (sections 2.3.1 and
2.3.2). Using the target model, this posterior pdf is projected
forwards to show that the prior pdf at the following time step k+1
is also a Gaussian mixture (section 2.3.3), so completing the proof.
The recurrence relations for updating and prediction are given, and
the solution is seen to be equivalent to a bank of parallel Kalman
filters whose number grows with time. The significance of an optimal
solution requiring propagation of an ever increasing number of
Gaussian components is discussed in section 2.4.


2.2  Problem  statement 


The problem is to provide an estimate of the state x of a single
target at discrete time steps, based on all the available information.
The state vector x typically consists of target position and velocity,
but other attributes of the target may also be included. It is assumed
that x evolves according to a linear recurrence relation of the form:

$$ x_{k+1} = \Phi x_k + \Gamma w_k , \qquad (2.1) $$

where $x_k$ is the n-component state vector at time $t_k$,
$\Phi$ is the $n \times n$ state transition matrix,
$\Gamma$ is an $n \times r$ matrix
and $w_k$ is an r-component vector of system driving noise which has
a Gaussian distribution with zero mean and covariance given by:

$$ E\left[ w_i w_k^T \right] = Q \, \delta_{ik} . $$

Here Q is a positive definite $r \times r$ matrix and $\delta_{ik}$ is the Kronecker
delta. Equation (2.1) describes the kinematics of the target and is
known as the target model. Initially, at time $t_1$, the state vector
$x_1$ is assumed to have a Gaussian distribution with known mean $\bar{x}_1$ and
covariance $M_1$ (a positive definite $n \times n$ matrix).
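As an illustration of the target model (2.1), the following sketch propagates a two-component state (position and velocity in one dimension) driven by scalar Gaussian noise. The particular transition matrix and noise gain are assumptions chosen for the example (a common second order, constant velocity form) and are not taken from this study; the function names are hypothetical, and the initial state is passed in directly rather than drawn from its Gaussian distribution.

```python
import random


def step(state, dt, sigma_w, rng):
    """Advance (position, velocity) by one step of x_{k+1} = Phi x_k + Gamma w_k."""
    position, velocity = state
    w = rng.gauss(0.0, sigma_w)  # scalar driving noise w_k, variance Q = sigma_w**2
    # Assumed Phi = [[1, dt], [0, 1]] and Gamma = [dt**2 / 2, dt].
    return (position + dt * velocity + 0.5 * dt * dt * w,
            velocity + dt * w)


def simulate(initial_state, n_steps, dt, sigma_w, seed=0):
    """Return the list of states x_1, ..., x_{n_steps + 1}."""
    rng = random.Random(seed)
    states = [initial_state]
    for _ in range(n_steps):
        states.append(step(states[-1], dt, sigma_w, rng))
    return states


trajectory = simulate(initial_state=(0.0, 1.0), n_steps=5, dt=1.0, sigma_w=0.1)
```

Setting `sigma_w` to zero gives straight line motion, which is a convenient check on the transition matrix.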


At every time step k, a single sensor scans a surveillance region
and passes a set of $m_k$ measurements to the tracking filter:

$$ Z_k = \{ z_{k1} , z_{k2} , \ldots , z_{k m_k} \} . $$

Each measurement $z_{kj}$ is a u-component vector. It is assumed that the
target is well inside the surveillance region of the sensor, but that
the (known) probability $P_D$ of detecting the target may be less than
unity. It is also assumed that at most one of the measurements may
originate from the target. If measurement $z_{kj}$ does originate from the
target, then it is related to the state vector by the linear relationship:

$$ z_{kj} = H x_k + v_k , \qquad (2.2) $$

where H is the $u \times n$ measurement matrix
and $v_k$ is a u-component vector of measurement noise which has a
Gaussian distribution with zero mean and covariance given by:

$$ E\left[ v_i v_k^T \right] = R \, \delta_{ik} . $$

Here R is a positive definite $u \times u$ matrix and $\delta_{ik}$ is the Kronecker
delta. A measurement which originates from the target is said to be
true, while all other measurements are false. A false measurement is
assumed to be independent of the state vector, to have a uniform
distribution over the surveillance region of the sensor and to be
independent of all other present and past measurements. False measurements
are assumed to occur at an average density of $\rho$ per unit area.
Further it is assumed that before examining the values of the
measurements in the set $Z_k$, there is no information on which, if any, of the
measurements are associated with the target.
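The measurement model above can be sketched as a simple generator of one scan. The example below is one-dimensional, so the surveillance region is an interval and the false measurement density is per unit length. The Poisson distribution for the number of false measurements is an assumption made for this sketch, and all function and parameter names are hypothetical.

```python
import math
import random


def poisson_sample(mean, rng):
    """Draw a Poisson variate by Knuth's multiplication method (small means only)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def measurement_set(target_pos, p_detect, sigma_v, rho, region, rng):
    """Return one scan of scalar measurements in random order."""
    low, high = region
    # Assumption: the number of false measurements is Poisson with mean rho * length.
    n_false = poisson_sample(rho * (high - low), rng)
    scan = [rng.uniform(low, high) for _ in range(n_false)]
    # At most one true measurement, present with probability p_detect.
    if rng.random() < p_detect:
        scan.append(target_pos + rng.gauss(0.0, sigma_v))
    # Shuffle so that the position in the list carries no origin information.
    rng.shuffle(scan)
    return scan


rng = random.Random(1)
scan = measurement_set(target_pos=5.0, p_detect=0.9, sigma_v=0.1,
                       rho=0.3, region=(0.0, 10.0), rng=rng)
```

The shuffle reflects the assumption that, before the measurement values are examined, nothing indicates which member of the set, if any, is true.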

The following information is available to the tracking filter:

(i) The distribution of the initial state vector including its
mean $\bar{x}_1$ and covariance $M_1$.

(ii) The target model, equation (2.1), including $\Phi$ and $\Gamma$.

(iii) The relationship between the state vector and the true
measurement, equation (2.2), including H.

(iv) The statistics of the false measurements, the true
measurement and the model driving noise, including $\rho$,
R and Q.

(v) The detection probability $P_D$ of the sensor.

(vi) The measurement sets for all past and current time
steps.

The tracking filter does not know:

(i) The values of the state vector $x_k$, or the noise vectors
$v_k$ and $w_k$ at any time step.

(ii) The identity of the true measurement.

Note that if the identity of the true measurement were known, the
problem would reduce to that of the standard Kalman filter (see
Appendix A).

2.3  The  Bayesian  solution 

2.3.1 The prior distribution of the state vector at time $t_k$

The prior pdf of the state vector at time $t_k$ is the pdf of $x_k$
given all available information up to time $t_k$ but excluding the set
of measurements received at time $t_k$. This available prior information
at time $t_k$ is denoted $\mathcal{I}_k$, and this includes all measurements
received at the previous time steps:

$$ Z_1 , Z_2 , \ldots , Z_{k-1} . $$

Since any one or none of the measurements of $Z_i$ could be true, there
are exactly $m_i + 1$ exclusive hypotheses concerning the truth or
falsehood of the members of $Z_i$. Thus the total number of possible
hypotheses under $\mathcal{I}_k$ is:

$$ n_{k-1} = \prod_{i=1}^{k-1} (m_i + 1) . \qquad (2.3) $$


Therefore, given $n_{k-1}$ possible hypotheses, the pdf of the state vector
$x_k$ may be written:

$$ p(x_k | \mathcal{I}_k) = \sum_{i=1}^{n_{k-1}} p(x_k | \mathcal{H}_{k-1,i} , \mathcal{I}_k) \, \Pr\{ \mathcal{H}_{k-1,i} | \mathcal{I}_k \} . \qquad (2.4) $$

Here $\mathcal{H}_{k-1,i}$ denotes one of the possible hypotheses on the measurements
available under $\mathcal{I}_k$, $p(x_k | \mathcal{H}_{k-1,i} , \mathcal{I}_k)$ is the pdf of $x_k$ assuming
$\mathcal{H}_{k-1,i}$ is correct and $\mathcal{I}_k$ is given, and $\Pr\{ \mathcal{H}_{k-1,i} | \mathcal{I}_k \}$ is the
probability that $\mathcal{H}_{k-1,i}$ is correct given the information $\mathcal{I}_k$. In
expression (2.4), the prior pdf of $x_k$ is written as the weighted sum
over all possible hypotheses of the pdf of $x_k$ conditional on each
hypothesis. The weighting factors in the summation are the corresponding
prior probabilities of each hypothesis being true. Equation (2.4) is
intuitively reasonable and is sometimes known as the total probability
theorem.
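The number of terms in the sum of equation (2.4) is given by the product in equation (2.3), and a short calculation shows how quickly it grows. The sketch below assumes only that each past scan containing m measurements contributes m + 1 exclusive hypotheses; the function name and the example scan sizes are hypothetical.

```python
import math


def hypothesis_count(measurement_counts):
    """Number of hypotheses n_{k-1} for past scans with the given sizes m_1, ..., m_{k-1}."""
    # Each scan of m measurements allows m + 1 options: any one true, or none.
    return math.prod(m + 1 for m in measurement_counts)


# Ten past scans of three measurements each (illustrative values only).
n_after_ten_scans = hypothesis_count([3] * 10)
```

With three measurements per scan the count is multiplied by four at every time step, so after ten scans more than a million Gaussian components would have to be maintained.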


Now suppose that the conditional pdfs in the RHS of equation (2.4)
are known to be Gaussian, ie

$$ p(x_k | \mathcal{H}_{k-1,i} , \mathcal{I}_k) = \mathcal{N}(x_k ; \bar{x}_{ki} , M_{ki}) , \qquad (2.5) $$

where $\bar{x}_{ki}$ and $M_{ki}$ are known, and $\mathcal{N}(a; b, C)$ denotes a Gaussian pdf
evaluated at a with mean b and covariance C. Also suppose that
the probabilities of the hypotheses are known and are denoted:

$$ \Pr\{ \mathcal{H}_{k-1,i} | \mathcal{I}_k \} = \beta_{k-1,i} . \qquad (2.6) $$

In this case equation (2.4) is a fully specified Gaussian mixture pdf
where each Gaussian component corresponds to one of the possible
hypotheses. Note that the above suppositions are true for k = 1,
in which case, from the problem statement,

$$ p(x_1 | \mathcal{I}_1) = \mathcal{N}(x_1 ; \bar{x}_1 , M_1) , $$

which is a degenerate Gaussian mixture with a single component.


2.3.2  The  posterior  pdf  of  the  state  vector 

The set $Z_k$ of $m_k$ measurements received at time $t_k$ is to be
used to update the prior pdf of $x_k$ specified by equations (2.4) to
(2.6). The resulting posterior pdf is denoted:

$$ p(x_k | Z_k , \mathcal{I}_k) . $$

In the following working we shall omit $\mathcal{I}_k$ for ease of notation,
although the dependency should be understood for all conditional
probabilities and pdfs. Thus the posterior pdf of $x_k$ will be
written:

$$ p(x_k | Z_k) . $$

After updating with the latest set of measurements, the total number
of possible hypotheses is increased to:

$$ n_{k-1} (m_k + 1) . $$


This increase may be viewed as a branching process where each of the
$n_{k-1}$ prior hypotheses of equation (2.4) may be seen as a potential
track and each of these tracks then splits into a further $m_k + 1$ tracks
resulting from the new set of measurements. Thus a posterior hypothesis
including the latest set of measurements may be written as a joint
hypothesis:

$$ \mathcal{H}_{kij} = ( \mathcal{H}_{k-1,i} , \psi_{kj} ) , $$

where $\psi_{kj}$ is independent of $\mathcal{H}_{k-1,i}$ and indicates that the jth
measurement of set $Z_k$ is true (or that they are all false if j = 0).
The complete set of posterior hypotheses is:

$$ \{ \mathcal{H}_{kij} : i = 1 , \ldots , n_{k-1} ; \; j = 0 , \ldots , m_k \} . $$

Hence the posterior pdf of $x_k$ may be written in the form:

$$ p(x_k | Z_k) = \sum_{i=1}^{n_{k-1}} \sum_{j=0}^{m_k} p(x_k | \mathcal{H}_{kij} , Z_k) \, \Pr\{ \mathcal{H}_{kij} | Z_k \} . \qquad (2.7) $$


First consider the posterior pdf of $x_k$ conditioned on $\mathcal{H}_{kij}$.
For $j \neq 0$, $p(x_k | \mathcal{H}_{kij} , Z_k)$ is the probability density resulting from
updating the prior pdf (2.5) on the assumption that the jth measurement
from $Z_k$ is true. In this case $z_{kj}$ is the only useful measurement from $Z_k$
and the other members of $Z_k$ can be discarded since they contain no
relevant information. A true measurement $z_{kj}$ has a Gaussian
distribution:

$$ p(z_{kj} | x_k) = \mathcal{N}(z_{kj} ; H x_k , R) , $$

and the prior density of $x_k$ under $\mathcal{H}_{k-1,i}$ is also Gaussian, given by
equation (2.5). Hence the required posterior density is also Gaussian
and is given by the standard Kalman filter (see Appendix A).
So for $j \neq 0$:

$$ p(x_k | \mathcal{H}_{kij} , Z_k) = \mathcal{N}(x_k ; \hat{x}_{kij} , P_{kij}) , $$

where

$$ \left. \begin{aligned}
\hat{x}_{kij} &= \bar{x}_{ki} + K_{ki} ( z_{kj} - H \bar{x}_{ki} ) , \\
K_{ki} &= M_{ki} H^T S_{ki}^{-1} , \\
P_{kij} &= M_{ki} - M_{ki} H^T S_{ki}^{-1} H M_{ki} , \\
S_{ki} &= H M_{ki} H^T + R .
\end{aligned} \right\} \qquad (2.8) $$

If j = 0, none of the members of $Z_k$ are true and so the prior pdf
is not modified:

$$ \hat{x}_{ki0} = \bar{x}_{ki} , \qquad P_{ki0} = M_{ki} . \qquad (2.9) $$
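The branching of one prior component into $m_k + 1$ posterior components can be sketched for a scalar state and scalar measurements, where relations (2.8) reduce to ordinary arithmetic. This is a minimal sketch under that scalar assumption; the function names are hypothetical, and `None` is used here to stand for the j = 0 hypothesis of relations (2.9).

```python
def conditioned_update(prior_mean, prior_var, z, h, r):
    """Posterior mean and variance of one component under one hypothesis.

    z is the measurement assumed true, or None if all are assumed false.
    """
    if z is None:
        # j = 0: no measurement is true, so the prior is unchanged.
        return prior_mean, prior_var
    s = h * prior_var * h + r             # innovation variance S
    gain = prior_var * h / s              # Kalman gain K
    mean = prior_mean + gain * (z - h * prior_mean)
    var = prior_var - prior_var * h * h * prior_var / s
    return mean, var


def branch(prior_mean, prior_var, measurements, h, r):
    """Split one prior component into len(measurements) + 1 posterior components."""
    options = [None] + list(measurements)  # j = 0 first, then j = 1, ..., m
    return [conditioned_update(prior_mean, prior_var, z, h, r) for z in options]


# One prior component and two measurements (illustrative values only).
components = branch(prior_mean=0.0, prior_var=1.0, measurements=[2.0, -1.0], h=1.0, r=1.0)
```

Note that the posterior variance for $j \neq 0$ does not depend on the measurement value, so in this scalar sketch it is the same for every branch that assumes a true measurement.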


Now turning to the second term in the summation of equation (2.7),
the posterior probability that $\mathcal{H}_{kij}$ is correct may be evaluated
using Bayes theorem:

$$ \Pr\{ \mathcal{H}_{kij} | Z_k \} = \frac{ p(Z_k | \mathcal{H}_{kij}) \, \Pr\{ \psi_{kj} | \mathcal{H}_{k-1,i} \} \, \Pr\{ \mathcal{H}_{k-1,i} \} }{ p(Z_k) } , \qquad (2.10) $$

where $p(Z_k)$ is a normalizing constant given by:

$$ p(Z_k) = \sum_{i=1}^{n_{k-1}} \sum_{j=0}^{m_k} p(Z_k | \mathcal{H}_{kij}) \, \Pr\{ \psi_{kj} | \mathcal{H}_{k-1,i} \} \, \Pr\{ \mathcal{H}_{k-1,i} \} . $$

Equation (2.10) indicates how the prior probability $\Pr\{ \mathcal{H}_{k-1,i} \}$
is modified by the observations at time $t_k$. The posterior probability
can be found by evaluating the three factors in the numerator of the
RHS of equation (2.10).


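First consider $p(Z_k | \Psi_{kij})$. This may be written:

$$p(Z_k | \Psi_{kij}) = \int p(Z_k | x_k, \Psi_{kij}) p(x_k | \Psi_{kij}) dx_k . \qquad (2.11)$$

Since the elements of $Z_k$ are independent:

$$p(Z_k | x_k, \Psi_{kij}) = \prod_{l=1}^{m_k} p(z_{kl} | x_k, \Psi_{kij}) .$$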


A measurement $z_{kl}$ is false under $\Psi_{kij}$ if $j \neq l$. False
measurements are uniformly distributed over the surveillance region of
the sensor, and so the pdf of a false measurement is $V_k^{-1}$, where
$V_k$ is the volume of the surveillance region. If $j = l$, the
measurement $z_{kl}$ is true and so is a sample from the Gaussian
distribution defined by equation (2.2). The prior pdf of $x_k$:

$$p(x_k | \Psi_{kij}) = p(x_k | \Psi_{k-1,i})$$

which is the Gaussian pdf (2.5). Hence on substituting into
equation (2.11) we obtain, for $j \neq 0$:

$$p(Z_k | \Psi_{kij}) = V_k^{-(m_k - 1)} \int \mathcal{N}(z_{kj}; H x_k, R) \mathcal{N}(x_k; \bar{x}_{ki}, M_{ki}) dx_k = V_k^{-(m_k - 1)} \mathcal{N}(z_{kj}; H\bar{x}_{ki}, S_{ki}) \qquad (2.12)$$

where $S_{ki}$ is defined in the relations (2.8) and the integral
is evaluated in Appendix A.2. Expression (2.12) is strictly correct
only for a surveillance region of infinite extent. However, the
truncation effect is negligible provided that, for each component of
$z_{kj}$, the distance from $H\bar{x}_{ki}$ to the boundary of the
surveillance region is large compared with the standard deviation of
the component.

If $j = 0$ so all the measurements are false:

$$p(Z_k | \Psi_{ki0}) = V_k^{-m_k} . \qquad (2.13)$$

The second factor in the numerator of (2.10) is the prior
probability of $\psi_{kj}$:

$$\Pr\{\psi_{kj} | \Psi_{k-1,i}\} = \Pr\{\psi_{kj}\}$$

since the hypothesis on the current set of measurements is independent
of hypotheses on measurements from previous time steps. The only prior
information available is the probability $P_D$ of detecting the target
and the probability of the sensor receiving m false measurements. If
false measurements are uniformly distributed over the measurement space
with density $\rho$, then it can be shown that the probability of m
false measurements falling within the surveillance region of the sensor
is given by a Poisson distribution. If the volume of the surveillance
region is $V_k$, the probability of receiving m false measurements is
given by:

$$g(m) = \frac{(\rho V_k)^m \exp(-\rho V_k)}{m!} . \qquad (2.14)$$


The hypothesis $\psi_{k0}$ corresponds to the event of failing to detect
the target and receiving $m_k$ false measurements. The prior probability
of this occurrence is:

$$\Pr\{\psi_{k0}\} = (1 - P_D) g(m_k) . \qquad (2.15)$$

Any of the hypotheses $\psi_{kj}$, $j \neq 0$, could correspond to the
situation of detecting the target and receiving $m_k - 1$ false
measurements. A priori, each of these hypotheses is equally probable,
and since there are $m_k$ of them (for $j \neq 0$):

$$\Pr\{\psi_{kj}\} = P_D g(m_k - 1) / m_k . \qquad (2.16)$$


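The third factor in the numerator of (2.10) is given
directly by equation (2.6):

$$\Pr\{\Psi_{k-1,i}\} = \beta_{k-1,i} . \qquad (2.17)$$

Substituting (2.12) to (2.17) into (2.10) we obtain:

$$\Pr\{\Psi_{kij} | Z_k\} =
\begin{cases}
\beta_{k-1,i} P_D \mathcal{N}(z_{kj}; H\bar{x}_{ki}, S_{ki}) / E & \text{for } j \neq 0 \\
\beta_{k-1,i} (1 - P_D) \rho / E & \text{for } j = 0
\end{cases} \qquad (2.18)$$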


where

$$E = \sum_{i=1}^{n_{k-1}} \beta_{k-1,i} \left[ (1 - P_D) \rho + P_D \sum_{j=1}^{m_k} \mathcal{N}(z_{kj}; H\bar{x}_{ki}, S_{ki}) \right]$$

is the normalizing denominator. This equation is of key importance
because it defines the weightings of the mixture distribution (2.7).
Note that the volume $V_k$ of the surveillance region does not appear
in (2.18). Also note that if $P_D = 1$, knowledge of the density
$\rho$ of false measurements does not contribute to the posterior pdf.

Thus the posterior pdf of $x_k$ given by equation (2.7) is a fully
specified Gaussian mixture. Equation (2.7) can be rewritten as a
single sum by defining:


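$$\Psi_{kl} = \Psi_{kij} , \qquad \hat{x}_{kl} = \hat{x}_{kij} \qquad \text{and} \qquad P_{kl} = P_{kij}$$

where $l = (i - 1)(m_k + 1) + j + 1$, for $i = 1, \ldots, n_{k-1}$ and
$j = 0, \ldots, m_k$. Thus:

$$p(x_k | Z_k) = \sum_{l=1}^{n_k} p(x_k | \Psi_{kl}, Z_k) \Pr\{\Psi_{kl} | Z_k\} \qquad (2.19)$$

where

$$n_k = n_{k-1}(m_k + 1) = \prod_{i=1}^{k} (m_i + 1) ,$$

$$p(x_k | \Psi_{kl}, Z_k) = \mathcal{N}(x_k; \hat{x}_{kl}, P_{kl})$$

and $\Pr\{\Psi_{kl} | Z_k\} = \beta_{kl}$.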
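
As an illustration of how the weightings (2.18) might be evaluated, the following sketch computes the posterior hypothesis probabilities for a small set of tracks and measurements. The function names, the array layout (column 0 holding the "all false" hypothesis) and the numerical values are assumptions made for this example.

```python
import numpy as np

def gaussian_pdf(z, mean, cov):
    """Density of N(z; mean, cov) for a measurement vector z."""
    d = z - mean
    norm = np.sqrt((2.0 * np.pi) ** len(z) * np.linalg.det(cov))
    return float(np.exp(-0.5 * d @ np.linalg.solve(cov, d)) / norm)

def hypothesis_weights(beta_prior, z_pred, S, measurements, p_d, rho):
    """Return w[i, j], the posterior probability of hypothesis (i, j) as in (2.18).

    beta_prior[i]   : prior weight of track i
    z_pred[i], S[i] : predicted measurement H x_bar_ki and its covariance S_ki
    measurements    : received measurement vectors (j = 1, ..., m_k)
    p_d, rho        : detection probability and false measurement density
    """
    n, m = len(beta_prior), len(measurements)
    w = np.zeros((n, m + 1))
    for i in range(n):
        w[i, 0] = beta_prior[i] * (1.0 - p_d) * rho          # j = 0
        for j, z in enumerate(measurements, start=1):
            w[i, j] = beta_prior[i] * p_d * gaussian_pdf(z, z_pred[i], S[i])
    return w / w.sum()   # division by the normalizing denominator E

# Hypothetical case: two prior tracks and two received measurements.
beta_prior = [0.5, 0.5]
z_pred = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]
S = [np.eye(2), np.eye(2)]
measurements = [np.array([0.1, -0.2]), np.array([4.0, 6.0])]
w = hypothesis_weights(beta_prior, z_pred, S, measurements, p_d=0.9, rho=0.01)
```

Setting `p_d` to 1 makes the j = 0 column vanish, so that the result no longer depends on `rho`, in agreement with the remark following (2.18).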

The Gaussian mixture (2.19) contains all the available
information on the state vector $x_k$ after taking account of the latest
set of measurements $Z_k$. Thus in principle, the optimal estimate based
on any desired criterion may be obtained from (2.19). This is
considered in section 2.4.

2.3.3 The prior pdf of the state vector at time $t_{k+1}$


To establish, by induction, the general property that the prior
pdf of $x_k$ (equation (2.4)) is a fully specified Gaussian mixture, it
is necessary to derive the pdf of $x_{k+1}$ from the result
(2.19). This pdf may be derived from $p(x_k | Z_k, \mathcal{I}_k)$ (note
$\mathcal{I}_k$ is reinstated here) via the propagation equation (2.1).
This information together with $Z_k$ and $\mathcal{I}_k$ is denoted
$\mathcal{I}_{k+1}$, which is all the prior information available at
time $t_{k+1}$. The prior pdf of $x_{k+1}$ may be written:

$$p(x_{k+1} | \mathcal{I}_{k+1}) = \int p(x_{k+1} | x_k) p(x_k | \mathcal{I}_{k+1}) dx_k . \qquad (2.20)$$

$p(x_{k+1} | x_k)$ is defined by the state propagation equations, and
the second term:

$$p(x_k | \mathcal{I}_{k+1}) = p(x_k | Z_k, \mathcal{I}_k)$$

since the extra information on state propagation from $t_k$ to
$t_{k+1}$ does not contribute to the pdf of state at $t_k$. Substituting
equation (2.19) into equation (2.20) and performing the integrations
(see Appendix A.3) gives:


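$$p(x_{k+1} | \mathcal{I}_{k+1}) = \sum_{l=1}^{n_k} \beta_{kl} p(x_{k+1} | \Psi_{kl}, \mathcal{I}_{k+1}) \qquad (2.21)$$

where

$$p(x_{k+1} | \Psi_{kl}, \mathcal{I}_{k+1}) = \mathcal{N}(x_{k+1}; \bar{x}_{k+1,l}, M_{k+1,l}) ,$$

$$\bar{x}_{k+1,l} = \Phi \hat{x}_{kl}$$

and

$$M_{k+1,l} = \Phi P_{kl} \Phi^T + \Gamma Q \Gamma^T .$$

The pdf (2.21) is of the same form as equation (2.4): it is
a fully specified Gaussian mixture. Hence the initial supposition of
section 2.3.1 is proved by induction.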

2.4  Discussion 

It  has  been  shown  that  the  posterior  pdf  of  the  target  state, 
just  after  incorporating  the  latest  set  of  measurements,  is  a  Gaussian 
mixture  given  by  equation  (2.19).  The  recursive  procedure  required  to 
obtain this result is shown in the flow diagram, Fig 2.1. This
procedure constitutes the optimal tracking filter for the problem
stated in section 2.2. The Gaussian mixture (2.19) is a
complete description of the filter's knowledge of the target state at
time step k. Each component of the mixture represents a potential
target track and is a Kalman filter estimate of the state vector
based on a possible history of true and false measurements. At time
$t_k$ the $n_k$ components represent all feasible track histories. The
weighting $\beta_{kl}$ is the probability that track history l is the
correct one.


The pdf of target state contains all the available information so
that, in principle, an optimal estimate based on any desired criterion
may be obtained. For example the minimum mean square error estimate is
the mean of the distribution (see Jazwinski). From equation (2.19)
the posterior mean of $x_k$ is given by:

$$\hat{x}_k = \sum_{l=1}^{n_k} \beta_{kl} \hat{x}_{kl} \qquad (2.22)$$

which is a weighted sum of the mean state vectors corresponding to
each possible track history. Also the covariance of this estimate may
be obtained from (2.19) (see Appendix B):

$$P_k = \sum_{l=1}^{n_k} \beta_{kl} (P_{kl} + \hat{x}_{kl} \hat{x}_{kl}^T) - \hat{x}_k \hat{x}_k^T . \qquad (2.23)$$


The mean may not be the most useful estimate for the state vector and
in any case, a single value of $\hat{x}_k$ is a somewhat inadequate
summary of a mixture distribution, especially if there are significant
well spaced components.
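
The mean (2.22) and covariance (2.23) of a mixture are simple to compute once the component parameters are known. The sketch below is illustrative only: the function name and the two-component example are assumptions. It also shows how poorly a single mean summarizes two well spaced components.

```python
import numpy as np

def mixture_moments(weights, means, covs):
    """Overall mean (2.22) and covariance (2.23) of a Gaussian mixture."""
    weights = np.asarray(weights, dtype=float)   # shape (n,)
    means = np.asarray(means, dtype=float)       # shape (n, d)
    covs = np.asarray(covs, dtype=float)         # shape (n, d, d)
    mean = weights @ means
    outer = np.einsum("li,lj->lij", means, means)
    second = np.einsum("l,lij->ij", weights, covs + outer)
    cov = second - np.outer(mean, mean)
    return mean, cov

# Two equally weighted, well spaced components with unit covariance.
mean, cov = mixture_moments(
    [0.5, 0.5],
    [[-2.0, 0.0], [2.0, 0.0]],
    [np.eye(2), np.eye(2)],
)
```

Here the overall mean lies midway between the two components, where the density is low, and the separation appears only as extra variance along the first axis.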


For most interesting cases, the number of components $n_k$ rapidly
becomes very large with increasing k (see equation (2.19)). This
rapid growth in the number of components may be viewed as a branching
process. For instance, suppose that at time step k-1, the mixture
distribution comprises two components. So there are two feasible
tracks which are projected forwards to time step k. Suppose that at
this time two measurements $z_{k1}$ and $z_{k2}$ are received. There are
three possibilities:

$\psi_{k0}$ : $z_{k1}$ and $z_{k2}$ are false,

$\psi_{k1}$ : $z_{k1}$ is true and $z_{k2}$ is false,

or

$\psi_{k2}$ : $z_{k1}$ is false and $z_{k2}$ is true.

Thus the two feasible tracks from the previous time step may each be
updated three different ways, giving rise to six feasible tracks at time
step k (see Fig 2.2). Since every component must be propagated at
each time step, implementation of the optimal filter is impractical,
and to proceed approximations must be imposed. This is the subject
of the next chapter.

Fig 2.1 The optimal tracking filter for the baseline problem

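
The branching growth discussed in section 2.4 follows directly from the relation for $n_k$ given with equation (2.19). A minimal sketch is given below; the function name and the ten-scan scenario are assumptions chosen for illustration.

```python
def component_count(n_prev, measurements_per_scan):
    """Number of mixture components after each scan when nothing is removed."""
    counts = []
    n = n_prev
    for m_k in measurements_per_scan:
        n *= m_k + 1   # each track is updated by m_k measurements or by none
        counts.append(n)
    return counts

counts = component_count(2, [2])        # the two-track, two-measurement example
growth = component_count(1, [3] * 10)   # ten scans of three measurements each
```

With only three measurements per scan, a single initial track would already have branched into more than a million components after ten scans.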


3  CONTROLLING  THE  GROWTH  OF  MIXTURE  COMPONENTS 

3.1 Introduction

To  implement  the  tracking  filter  described  in  the  previous 
chapter,  it  is  essential  to  control  the  growth  of  the  number  of 
components  in  the  Gaussian  mixture  (equation  (2.19))  at  every  time  step. 
The maximum number of components that can be allowed depends on the
computing power (in terms of storage and speed of operation) and the time
available to perform the calculations of the filter recursions. The
maximum number $N_T$ of components allowed in the mixture after
approximation should be chosen so that the probable increase in the
number of components from measurements received at the following time
step is within the capability of the processor. If the growth exceeds
this capability, the posterior pdf (2.19) may be truncated in
an arbitrary fashion, rather than be subject to a considered
approximation.

In this study, control of the growth of hypotheses is achieved in
two stages at each time step. Firstly a coarse acceptance test is
applied (see section 3.2), which rejects any hypothesis that appears
to be very unlikely, on the basis of prior information. This control
is applied at point A on the filter flow diagram, Fig 2.1. This test
is computationally inexpensive as the unlikely hypotheses are rejected
before their corresponding posterior mixture components need be
evaluated. Hopefully the effect of this acceptance test on the
posterior distribution will be insignificant. Since it is quite likely
that the number of components left will still be excessive, further
reduction may be necessary. This is applied at point B on Fig 2.1,
which is after the posterior mixture distribution of the target state
has been compiled from all hypotheses that have passed the coarse
acceptance test. The second stage is to approximate the mixture
distribution and so reduce the number of its components from a posterior
point of view (ie after filter update). To reduce the number of
components below the specified limit, $N_T$, it may be necessary to
make significant modifications to the distribution, and so careful
consideration should be given to the design of this approximation
method as it will affect filter performance.

What  we  require  from  a  mixture  reduction  algorithm  is  discussed 
in  section  3.3  and  reported  methods  for  such  approximations  are 
reviewed  in  section  3.4.  It  is  argued  that  these  reported  techniques 
do  not  adequately  fulfil  our  requirements  and  so  two  new  approximation 
algorithms  have  been  developed  (see  sections  3.6  and  3.7).  The 
performance  of  these  reduction  algorithms  for  a  tracking  problem  is 
assessed  by  simulation  in  the  following  chapter. 

3.2 Coarse acceptance test

Each component of the posterior pdf (2.19) of target state is
generated by updating a feasible track from the prior pdf with either
one of the received measurements or by prediction on the assumption that
all received measurements are false (see section 2.3). It is most
convenient to generate equation (2.19) by considering each feasible prior
track in turn, and evaluating all the possible posterior tracks which
spring from that branch. Consider the prior track, or component i of
equation (2.4), that corresponds to hypothesis $\Psi_{k-1,i}$. The prior
pdf of the true measurement under $\Psi_{k-1,i}$ is the Gaussian with
mean $H\bar{x}_{ki}$ and covariance $S_{ki}$. From knowledge of this
distribution, an acceptance or validation region in the measurement
space can be defined, such that under hypothesis $\Psi_{k-1,i}$ the
probability of the true measurement (if it is detected) falling outside
the region is very small. (This type of acceptance test is commonly
applied to measurement-track association problems where ambiguities may
exist - see Blackman1.) If the validation region is chosen so that the
probability density of the true measurement at any point within the
region exceeds that at all points outside the region, then since the
distribution is Gaussian, the acceptance region is the interior of a
hyperellipsoid. Thus a measurement $z_{kj}$ is accepted for updating
hypothesis $\Psi_{k-1,i}$ if and only if:

$$(z_{kj} - H\bar{x}_{ki})^T S_{ki}^{-1} (z_{kj} - H\bar{x}_{ki}) < T_A . \qquad (3.1)$$


Note that since the false measurements have a uniform distribution,
this is equivalent to subjecting each measurement to a likelihood ratio
test. For a true measurement $z_{kj}$, under hypothesis $\Psi_{k-1,i}$,
the LHS of (3.1) is a sample from a $\chi^2$ distribution with number
of degrees of freedom equal to the dimension of $z_{kj}$. So once the
acceptable probability $P_A$ of missing the true measurement (if the
target is detected) under hypothesis $\Psi_{k-1,i}$ has been chosen, the
required value of the threshold $T_A$ may be obtained from tables of
$\chi^2$.
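
The acceptance test (3.1) can be sketched as follows. For a two-dimensional measurement (an assumption of this example) the $\chi^2$ tail probability has the closed form $\exp(-T/2)$, so the threshold can be computed directly rather than read from tables. The function names are hypothetical.

```python
import math
import numpy as np

def gate_threshold_2d(p_a):
    """Threshold T with Pr{chi-square (2 dof) > T} = p_a, using exp(-T/2) = p_a."""
    return -2.0 * math.log(p_a)

def accept(z, z_pred, S, threshold):
    """Test (3.1): is the normalized squared innovation below the threshold?"""
    d = z - z_pred
    return float(d @ np.linalg.solve(S, d)) < threshold

T_A = gate_threshold_2d(0.001)   # approximately 13.82
inside = accept(np.array([1.0, 1.0]), np.zeros(2), np.eye(2), T_A)
outside = accept(np.array([4.0, 0.0]), np.zeros(2), np.eye(2), T_A)
```

For measurement spaces of other dimensions the threshold would have to be taken from tables or a statistics library.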


The acceptance test has been used in all the simulations of this
study. To avoid any significant performance degradation as a result
of the acceptance test, $P_A$ was set to the very small value of 0.001.
This corresponds to $T_A$ = 13.82 for two dimensional measurement space.
Also to take account of the possibility of rejecting the true
measurement, the detection probability $P_D$ should be replaced by
$P_D(1 - P_A)$ in equation (2.18). Thus, when the acceptance test is
employed, even if $P_D = 1$, a component is generated for the finite
probability of missing the true measurement.

3.3 Requirements of a mixture reduction algorithm

The  following  criteria  have  been  identified  for  the  design  of  a 
mixture  reduction  algorithm: 

(i)  The  approximation  should  result  in  another  Gaussian 
mixture.  This  is  necessary  to  allow  the  tracking  filter 
algorithm  to  be  implemented  as  a  bank  of  Kalman  filters. 

(ii) The algorithm should allow the maximum number $N_T$ of
components after approximation to be specified.

(iii) Whenever possible, reduction should be achieved without
modifying the 'structure' of the distribution beyond some
acceptable limit. Conversely, to avoid retaining unnecessary
components, reduction should continue until this limit is reached,
so that the approximation may contain fewer than $N_T$ components.
Note that this criterion is in terms of mixture structure
modification because it is feasible to define and compute such a
measure. Also it is likely that the extent of modification is
related to practical performance measures, such as the probability
of losing track, which cannot be readily computed as a function
of mixture approximation.

(iv) Intuitively, the approximation should preserve the mean
and covariance of the original mixture. Unfortunately, after
propagation of the approximated mixture via the filter update and
prediction relations, the mean and covariance of the updated
mixture will not, in general, coincide with those of the optimal
solution.

(v) The reduction algorithm should be computationally
efficient (reduction must be accomplished within the filter
update period), even when the original mixture consists of a
large number of components (for example over 100), each with a
different covariance matrix.

3.4 Review of mixture reduction techniques

A  number  of  techniques  for  controlling  the  growth  of  the  mixture 
distribution  have  been  reported.  The  simplest  method  is  to  reduce  the 
mixture  to  a  single  Gaussian  component  at  each  time  step,  and  the 
crudest  means  of  achieving  this  is  to  choose  that  mixture  component 
corresponding to the most probable hypothesis. When the probability
of detection is unity, this corresponds to the nearest neighbour
approach, ie update the track by using the measurement z which
minimizes the expression:

$$(z - H\bar{x}_k)^T S_k^{-1} (z - H\bar{x}_k) .$$

However this technique takes no account of the possibility that the
wrong hypothesis may have been chosen and results in what is essentially
a decision-directed filter (see Bar-Shalom). A considerable improvement
on this method is the Probabilistic Data Association Filter (PDAF)

3  7 

or  probabilistic  editor  ,  in  which  the  single  Gaussian  approximation 
is  chosen  to  match  the  mean  and  covariance  of  the  full  posterior 
mixture  (see  Appendix  B,  equations  (B-5)  and  (B-6)).  Thus  the 
hypotheses  are  effectively  combined  and  the  uncertainty  is  recognized 
in  the  covariance  of  the  approximating  Gaussian.  The  PDAF  has  been 
promoted principally by Bar-Shalom and it may be thought of as a
lower bound on the range of possible approximations meeting requirement
(iv) (the mean and covariance are preserved); the upper bound being
obtained when all components are retained. The PDAF does not meet
requirements (ii) or (iii). The filter performs well in a number of
cases (see Ref 15) and is computationally very economical. However
in many circumstances, the single Gaussian approximation will destroy
important structure in the mixture distribution, especially when a
number of well spaced components are present. In this case it should
be better to consider approximations which retain several components.

Singer et al have developed an N-scan filter in which
components of the mixture distribution are combined according to the
history  of  hypotheses.  If  several  components  result  from  updating  by 
the  same  measurements  over  the  last  N-scans,  then  the  components  are 
combined.  For  a  particular  N  ,  the  performance  of  this  method  is 
likely  to  depend  on  the  responsiveness  of  the  filter  to  incoming 
measurements.  This  in  turn  depends  on  the  covariances  Q  and  R  . 

If  the  filter  is  very  responsive,  components  with  the  same  measurement 
history  over  recent  scans  will  be  very  similar  and  the  consequent 
performance  penalty  in  combining  these  components  should  be  small. 

A  disadvantage  is  that  the  number  of  components  retained  is  not  limited 
(requirement  (ii)).  However  provided  N  is  small,  the  algorithm  should 
be  computationally  efficient:  no  measures  of  similarity  need  be 
calculated,  although  the  recent  history  of  measurement  acceptance  must 
be  stored.  For  the  simulation  example  reported  in  Ref  10,  near 
optimal  performance  is  claimed  for  only  a  single  scan  memory. 

Gaussian mixture distributions with an increasing number of
components also occur in system switching problems, where the parameters
of the system are subject to abrupt changes or jumps. Thus approximation
techniques have also been developed to implement filters for these
problems (see the survey by Pattipati and Sandell). A technique
known as the generalized pseudo Bayes algorithm (GPBA) has been
developed by Jaffer and Gupta (Ref 39), which is the equivalent of the
N-scan memory filter. In a similar vein, Blom (Ref 40) has developed an
interacting multiple model (IMM) algorithm in which components of the
prior distribution are merged before measurement update. The special
case of the GPBA where only a single Gaussian is propagated is called
the pseudo Bayes method, and this is the equivalent of the PDAF. The
pseudo Bayes method was proposed by Ackerson and Fu, although they
omitted the 'between components' contribution to the covariance of the
approximating Gaussian (see next section).

The remainder of the methods described in this section are
direct approximations of the posterior mixture distribution, without
reference to the measurement history, which allow more than one
component to be retained. All of these techniques involve merging or
discarding components of the mixture. The simplest of these schemes
is to retain only the N most probable components at each time step
(see Tugnait, Ref 42). A refinement of this method suggested in Ref 43
is to combine components which are close in the sense of the
Bhattacharyya distance measure (see below) before rejecting components;
but this does not appear to have been implemented.

Alspach (Ref 25) and Lainiotis and Park (Ref 44) have suggested schemes
in which the mixture is approximated by merging and pruning operations,
none of which exceed a specified penalty measure. Alspach defines the
penalty of approximating the mixture $p(x)$ by $p_A(x)$ as the
Kolmogorov variational distance between the two distributions:

$$K = \int |p(x) - p_A(x)| dx .$$


Weiss, Upadhyay and Tenney also analyse the penalty of merging
components in terms of K. Lainiotis and Park use a penalty measure
based on the Bhattacharyya coefficient $\rho$, which is defined by:

$$\rho = \int \sqrt{p(x) p_A(x)} dx .$$

$\rho$ lies between zero and one, and $\rho = 1$ if $p(x) = p_A(x)$. Thus
$1 - \rho$ is a measure of the penalty of approximating $p(x)$ by
$p_A(x)$. These distance measures are related by (see Kailath):

$$2(1 - \rho) \leq K \leq 2\sqrt{1 - \rho^2} .$$

Bounds  on  these  penalty  measures  in  terms  of  the  mixture  parameters 
have  been  derived  for  deleting  a  component  and  for  merging  a  pair  of 
components  (see  Refs  25  and  44).  The  authors  suggest  that  fixed 
acceptable  penalty  levels  should  be  chosen  and  that  the  mixture  should 
be  reduced  by  merging  and  pruning  operations  which  do  not  exceed  these 
penalty  levels.  The  method  of  Lainiotis  and  Park  would  require  the 
calculation  of  the  Bhattacharyya  coefficient  between  every  pair  of 
mixture  components.  This  would  be  very  time  consuming  and  the  method 
does  not  appear  to  have  been  implemented.  The  method  of  Alspach 
assumes  that  the  covariance  of  all  components  is  the  same.  This 
situation  is  maintained  as  filtering  proceeds  by  ignoring  the  between 
component  contribution  to  the  covariance  of  the  merged  components,  and 
so  overall  covariance  is  not  preserved  with  this  method. 

The mixture reduction techniques derived in the following
sections may be viewed as developments of these direct approximation
methods. The new algorithms, which are essentially merging operations,
cater for components with different covariances, and the maximum number
of components after approximation may be chosen as required. Also, at
each time step, the overall mean and covariance are preserved, save
for certain insignificant components which may be discarded. The
algorithms are based on the premise that changes to the 'structure' of
the mixture should be minimized. The measure of structure is derived
from a decomposition of the mixture covariance matrix.

3.5  Mixture  structure:  the  covariance  matrix 

Consider any N-component mixture distribution with pdf:

$$p(x) = \sum_{i=1}^{N} \beta_i p_i(x)$$

where $p_i(x)$ is a component pdf and $\beta_i$ is a probability
associated with the ith component such that:

$$\beta_i > 0 \qquad \text{and} \qquad \sum_{i=1}^{N} \beta_i = 1 .$$


The covariance matrix P of this mixture may be decomposed into two
contributions, W and B (see Appendix B, equation (B-3)):

$$P = W + B$$

where

$$W = \sum_{i=1}^{N} \beta_i P_i ,$$

$$B = \sum_{i=1}^{N} \beta_i (\hat{x}_i - \hat{x})(\hat{x}_i - \hat{x})^T ,$$

$$\hat{x} = \sum_{i=1}^{N} \beta_i \hat{x}_i$$

is the mean of the distribution and $\hat{x}_i$ and $P_i$ are the mean
and covariance of the ith component. The matrix W may be interpreted as
the contribution from the covariance 'within' each component of the
mixture and it depends on the spread of each individual component.
B may be interpreted as the between component contribution which is
due to the separation between the mixture components. B and W are
both symmetric matrices, W being positive definite and B being
positive semidefinite.
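
The decomposition of the mixture covariance into within and between contributions may be checked numerically. The following is a minimal sketch; the function name and the example mixture are assumptions.

```python
import numpy as np

def within_between(weights, means, covs):
    """Return the within (W) and between (B) contributions to the mixture covariance."""
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    mean = weights @ means                       # overall mean
    W = np.einsum("i,ijk->jk", weights, covs)    # weighted sum of component covariances
    dev = means - mean                           # component means relative to overall mean
    B = np.einsum("i,ij,ik->jk", weights, dev, dev)
    return W, B

W, B = within_between(
    [0.25, 0.75],
    [[0.0, 0.0], [4.0, 0.0]],
    [np.eye(2), 2.0 * np.eye(2)],
)
P_total = W + B
```

In this example B has rank one because the two means differ along a single direction, which illustrates why B is in general only positive semidefinite.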

Suppose that the mixture distribution is approximated by merging
several components together. If S is the set of subscripts of
components to be merged, then the probability mass of the new component
is:

$$\beta' = \sum_{i \in S} \beta_i . \qquad (3.2)$$



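To preserve the overall mean of the mixture (requirement (iv))

$$\sum_{i=1}^{N} \beta_i \hat{x}_i = \beta' \hat{x}' + \sum_{i \notin S} \beta_i \hat{x}_i ,$$

so that the mean of the new component is given by:

$$\hat{x}' = \frac{1}{\beta'} \sum_{i \in S} \beta_i \hat{x}_i . \qquad (3.3)$$

Also to preserve the overall covariance, from equation (2.23):

$$\sum_{i=1}^{N} \beta_i (P_i + \hat{x}_i \hat{x}_i^T) = \beta' (P' + \hat{x}' \hat{x}'^T) + \sum_{i \notin S} \beta_i (P_i + \hat{x}_i \hat{x}_i^T) ,$$

so that the covariance of the new component is given by:

$$P' = \frac{1}{\beta'} \sum_{i \in S} \beta_i (P_i + \hat{x}_i \hat{x}_i^T) - \hat{x}' \hat{x}'^T . \qquad (3.4)$$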
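
The merging relations (3.2) to (3.4) may be sketched as below. The function name `merge` and the two-component example are assumptions for illustration; the arguments are the weights, means and covariances of only those components in the set to be merged.

```python
import numpy as np

def merge(weights, means, covs):
    """Merge the given components into one, preserving mass, mean and covariance."""
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    beta = weights.sum()                                   # (3.2)
    mean = (weights @ means) / beta                        # (3.3)
    outer = np.einsum("ij,ik->ijk", means, means)
    second = np.einsum("i,ijk->jk", weights, covs + outer)
    cov = second / beta - np.outer(mean, mean)             # (3.4)
    return beta, mean, cov

beta, mean, cov = merge(
    [0.2, 0.3],
    [[0.0, 0.0], [2.0, 0.0]],
    [np.eye(2), np.eye(2)],
)
```

The merged covariance is larger than either original covariance along the direction separating the two means, which is the shift from between to within covariance.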


Although the overall covariance P is unchanged, this merging of
components results in a loss of between components covariance B which
is balanced by an increase in W. To see this, let W' and B' be
the within and between covariances of the approximated mixture. Since
overall covariance is preserved:

$$P = W + B = W' + B' . \qquad (3.5)$$

Thus the matrix L defined as:

$$L = B - B' ,$$

is given by:

$$L = W' - W = \sum_{i \in S} \beta_i (\hat{x}_i - \hat{x}')(\hat{x}_i - \hat{x}')^T \qquad (3.6)$$

which is a positive semi-definite matrix. This shift of covariance
from B to W is a rough measure of the change in the structure of a
mixture distribution when components are combined. (Techniques have
been developed for Cluster Analysis using a similar decomposition of
the data scatter matrix (see Hand).)


3.6 The Joining Algorithm
3.6.1  Derivation 

Ideally the final partition of components into sets for merging
should be such that the increase in some cost function is minimized.
However to reduce the mixture from N to M components, this could
involve the evaluation of the criterion for every possible partition
to identify the minimum. Such a procedure for a number of different
values of M would be far too time consuming and so a suboptimal
approach has been adapted from the agglomerative methods of Cluster
Analysis (see Hand). In this approach, which we call the Joining
Algorithm, a pair of components are merged at every iteration of the
algorithm. The components for merging are chosen to minimize the
increase in the chosen criterion at each stage. Clearly there is no
guarantee that the final partition from such a procedure will achieve
the smallest possible value of the cost function.


To implement the Joining Algorithm using a cost function based on
an increase in the within component covariance, we require a suitable
scalar measure. From equations (3.3) and (3.6), if components i and
j are merged, the increase in W is given by:

$$L_{ij} = \frac{\beta_i \beta_j}{\beta_i + \beta_j} (\hat{x}_i - \hat{x}_j)(\hat{x}_i - \hat{x}_j)^T . \qquad (3.7)$$


One possible measure is the trace of $L_{ij}$, which is the squared
Euclidean distance between the component means modified by the factor
$\beta_i \beta_j / (\beta_i + \beta_j)$. However this has the
disadvantage that it is dependent on the scaling of the elements of the
state vector and so is problem dependent. This difficulty is avoided by
using the Mahalanobis distance (see Ref 47) to give:

$$d_{ij}^2 = \frac{\beta_i \beta_j}{\beta_i + \beta_j} (\hat{x}_i - \hat{x}_j)^T P^{-1} (\hat{x}_i - \hat{x}_j) \qquad (3.8)$$

where P is the covariance of the whole mixture. This distance
measure is related to $L_{ij}$ by:

$$d_{ij}^2 = \mathrm{tr}(P^{-1} L_{ij}) .$$

This measure is invariant under all non-singular linear transformations
of the state vector. At each iteration of the Joining Algorithm, the
two components which are closest in the sense of the distance measure
equation (3.8) are combined to form a new component defined by
equations (3.2) to (3.4).

The minimum value of the distance measure at each iteration is
an indicator of the change in distribution structure resulting from
the merging of the two closest components. It is shown in Appendix C
that this minimum distance increases monotonically as reduction proceeds,
and so each merging operation increases this measure of structural
modification. (Distance measures with this property are said to be
not subject to reversals - see Anderberg (Ref 48), page 141.) Thus if a
threshold defining the acceptable modification to the distribution is
specified, approximation should proceed until the minimum distance
exceeds this threshold. For convenience we compare the squared
distance $d_{ij}^2$ with a threshold T. In choosing a value for the
threshold T, it is useful to note that the squared distance $d_{ij}^2$
is bounded. To see this (from equations (3.5) and (3.6)):

    P = W + B = W + B' + B - B'

      = (W + B') + L_{ij} ,

where P and W are positive definite n x n matrices, and B' and
L_ij are positive semi-definite. Multiply through by P^{-1} to give:

    I = P^{-1} (W + B') + P^{-1} L_{ij} .

Taking the trace gives:

    n = tr[ P^{-1} (W + B') ] + tr[ P^{-1} L_{ij} ] .

Hence, since P and (W + B') are both positive definite, the first
trace is strictly positive and so

    d_{ij}^2 = tr[ P^{-1} L_{ij} ] < n .


Note that for our tracking problem, n is the dimension of the
state space. Thus we have chosen T to be a constant fraction of
this upper bound n. Simulation studies indicate that a value of:

T  =  0.001  n 

retains  sufficient  components  to  give,  on  visual  inspection,  a  good 
approximation  to  the  mixture. 

At each iteration, the algorithm determines the number N_R of
remaining components, excluding the set of smallest components with
total probability mass (ie the sum of their β weights) less than
B_T. If d_ij^2 exceeds T before N_R has been reduced below the
specified maximum N_T, then approximation continues beyond the
acceptable limit of modification. The purpose of B_T, which has
been set to 0.01, is to avoid wasting effort on grouping insignificant
components. A flow diagram of the Joining Algorithm is given in
Fig 3.1.
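
The stopping logic described above can be sketched for a scalar (one-dimensional) mixture as follows. This is an illustrative reading of the text rather than a transcription of Fig 3.1. The merge rule is assumed to be the usual moment-preserving combination, since equations (3.2) to (3.4) are not reproduced in this section. Stopping is assumed to occur once the minimum distance exceeds T and N_R is no greater than N_T. The function names and example values are hypothetical.

```python
# Sketch under stated assumptions: scalar components are (weight, mean, variance)
# tuples; names and the stopping test "N_R <= N_T" are illustrative choices.

def merge(a, b):
    """Moment-preserving merge of two scalar Gaussian components."""
    (b_a, m_a, v_a), (b_b, m_b, v_b) = a, b
    beta = b_a + b_b
    mean = (b_a * m_a + b_b * m_b) / beta
    var = (b_a * v_a + b_b * v_b) / beta + b_a * b_b / beta ** 2 * (m_a - m_b) ** 2
    return (beta, mean, var)


def significant_count(comps, b_t):
    """N_R: components left after ignoring the smallest ones whose
    combined weight stays below b_t."""
    total, ignored = 0.0, 0
    for beta, _, _ in sorted(comps):
        if total + beta >= b_t:
            break
        total += beta
        ignored += 1
    return len(comps) - ignored


def joining_reduce(comps, n_t, t, b_t=0.01):
    """Merge the closest pair repeatedly until both limits are satisfied."""
    comps = list(comps)
    mean = sum(b * m for b, m, _ in comps)
    # Mixture variance P; it is unchanged by moment-preserving merges.
    p = sum(b * (v + (m - mean) ** 2) for b, m, v in comps)
    while len(comps) > 1:
        d2, i, j = min(
            (comps[i][0] * comps[j][0] / (comps[i][0] + comps[j][0])
             * (comps[i][1] - comps[j][1]) ** 2 / p, i, j)
            for i in range(len(comps)) for j in range(i + 1, len(comps)))
        if d2 > t and significant_count(comps, b_t) <= n_t:
            break
        merged = merge(comps[i], comps[j])
        comps = [c for k, c in enumerate(comps) if k not in (i, j)] + [merged]
    return comps


# Hypothetical mixture: two tight groups plus one insignificant outlier.
mixture = [(0.4, 0.0, 1.0), (0.3, 0.1, 1.0), (0.2, 10.0, 1.0),
           (0.095, 10.2, 1.0), (0.005, -20.0, 1.0)]
reduced = joining_reduce(mixture, n_t=2, t=0.001)
print(len(reduced))
```

With these example values the two tight groups are each merged at distances well below T, and reduction then stops with three stored components, because the outlier of weight 0.005 is not counted towards N_R.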

3.6.2  An  example  of  mixture  reduction  with  the  Joining  Algorithm 

The  Joining  Algorithm  has  been  applied  to  a  four-dimensional 
Gaussian  mixture  distribution  taken  from  the  tracking  simulation  of 
Chapter  4.  For  illustration,  the  distribution  is  only  shown  as  a 
function  of  two  dimensions  x  and  y  ,  which  are  the  Cartesian 
co-ordinates  of  the  target  position.  Fig  3.2  gives  a  perspective  view 
of  the  pdf  of  the  original  mixture,  while  in  Fig  3.3  it  is  shown  as  a 
contour  plot  with  logarithmic  contour  spacing  to  bring  out  the  shape 
of the smaller components. The distribution is composed of 37
components.

The final partition of components produced by the Joining
Algorithm with N_T = 10 is shown in Fig 3.4. In this figure the means
of the original components are plotted as numbers which denote the
final component to which the original is assigned. The final components
are ordered according to decreasing probability mass, so component
number 1 has the largest β weight. The original components are colour
coded according to their β weights as indicated on the diagrams. The
actual position of the target is also shown; it is close to the means
of two of the larger original components. Note that after reduction
the maximum permitted number of components, ie ten, has been retained,
indicating that the minimum squared distance measure has exceeded the
acceptable modification threshold T. The grouping of components
shown  in  Fig  3.4  appears  to  be  consistent  with  maintaining,  as  far  as 
possible,  the  structure  of  the  distribution,  although  it  should  be  noted 
that  the  distribution  is  four-dimensional  and  only  two  of  these 
dimensions  are  shown  here.  The  mixture  approximation  corresponding 
to  this  partition  is  shown  in  Fig  3.5;  it  appears  to  be  an  excellent 
approximation  of  the  original  (Fig  3.2). 

If N_T is reduced to four, the components are further merged to
produce the partition shown in Fig 3.6. Here the original central
concentration of components has been split into three groups, of which
number 3 includes one of the less significant remote concentrations.
The mixture approximation for N_T = 4 is shown in Fig 3.7. Comparing
this with Fig 3.2, it can be seen that the original mixture has been
significantly modified.

The history of how components are merged together for this example
is illustrated by the tree diagram of Fig 3.8. The mean (x and y
elements only) and the β weight of each of the original mixture
components are listed on the left hand side of this diagram. The tree
structure which grows from these components indicates which components
were merged together and at what joining distance* this occurred.
Since the joining distance always increases (as shown in Appendix C),
the sequence in which components were combined is the same as the
ordering of the merging from left to right in the diagram. For this
reason it is always possible to arrange the original components so that
none of the branches of the tree cross one another. Note that the
joining distance is plotted on a logarithmic scale and that all
components with β weights less than 0.001 have been merged at least
once before the joining distance has risen above d^2 = 5 x 10^-4, which
is only 0.0125% of the maximum possible joining distance d^2 = 4.

In this example the mixture could be reduced to 17 components
without exceeding the joining distance threshold T = 0.004, but to
achieve a reduction to 10 components, the final joining distance
was d^2 = 0.028. The numbering of the branches at this stage
on the diagram corresponds to the cluster numbers of Fig 3.4, so the
clusters are numbered according to decreasing probability mass. To
further reduce the mixture to only four components, the joining
distance increased to about 0.3, which is 75T. The branch numbers at
this stage correspond to the cluster numbers of Fig 3.6. If merging
continues until only one component remains, the single Gaussian PDAF
approximation of the mixture is produced. The final merging is at a
distance d^2 = 0.703, within the theoretical limit of 4. (Further
examples of the final joining distance for N_T = 10 and N_T = 4 are
given in the next section.)

* We loosely refer to joining distance although this is actually the
squared measure d_ij^2.

3.6.3  The control of mixture components for a tracking example


The  Joining  Algorithm  (in  conjunction  with  the  coarse  acceptance 
test)  has  been  used  to  control  the  growth  of  mixture  components  in  a 
simulation  of  target  tracking  using  the  filter  described  in  Chapter  2. 
The  tracking  problem  is  specified  in  Chapter  4. 

In  Fig  3.9  the  number  of  mixture  components  before  and  after 
reduction  by  the  Joining  Algorithm  is  shown  for  each  time  step  during 
the  tracking  operation.  Between  time  steps,  the  number  of  components 
increases  according  to  the  number  of  measurements  passed  by  the  coarse 
acceptance  test.  Also  shown  is  the  final  joining  distance  at  each 
time step. For this example the threshold T was exceeded on 42% of
the time steps to achieve an acceptable reduction specified by N_T = 10.
Note that when the final joining distance is below T, the number of
components in the reduced mixture is usually less than N_T. When N_T
is reduced to 4 (Fig 3.10), the final joining distance is almost
always greater than T, and on average is about ten times larger than
the average final joining distance for N_T = 10.

3.7  The Clustering Algorithm

3.7.1  Derivation 

The second algorithm is based on the proposition that the mixture
components with the largest β weightings carry the most important
information. Thus starting with the largest component, this algorithm
gathers in all surrounding components that are in some sense close to
the principal component. Subsequently the largest component of the
remainder is selected and the process is repeated until all the
components have been clustered. This is called the Clustering Algorithm.

The distance measure chosen to represent the closeness of
component i to the cluster centre is defined by:

    D_i^2 = \frac{\beta_i \beta_c}{\beta_i + \beta_c} (x_i - x_c)^T P_c^{-1} (x_i - x_c)        (3.9)

where β_c, x_c and P_c are the probability mass, mean and covariance
of the principal component, and β_i and x_i are the probability mass
and mean of the ith component. This is the same as the distance
measure d_ij^2 of the Joining Algorithm, except that the distance is
normalized to the covariance of the cluster centre rather than the
complete mixture. Indeed equation (3.7) is the motivation for the
definition of D_i^2. Note that D_i^2 is independent of the covariances
of components being tested for clustering and that the selection of
components for each cluster only involves the inversion of one
symmetric matrix P_c. Any component i for which D_i^2 < T_C is
selected as a cluster member. The threshold T_C defines the
acceptable modification to the distribution.


In choosing T_C it is helpful to first consider the measure D_i'^2
defined by:

    D_i'^2 = (x_i - x_c)^T P_c^{-1} (x_i - x_c) .

If the criterion for clustering a component i were D_i'^2 < T',
then any component i whose mean were to fall within the hyperellipsoid
defined by T' would be clustered. This hyperellipsoid is a contour
of constant probability density of the principal component and the
proportion of probability mass enclosed is a measure of the selectivity
of the clustering operation. If T' were chosen so that only a small
proportion, say 1%, of the probability mass of the cluster centre were
enclosed, then the structure of the distribution should be little
altered by clustering. However D_i'^2 is independent of the probability
mass β_i of the component, and intuitively, merging a large component
would have a greater effect on the mixture than merging a small
component. The modifying factor β_i β_c/(β_i + β_c) biases this distance
so that small components are more easily clustered while large components
retain their individuality. It is suggested that the threshold for:

    D_i^2 = \frac{\beta_i \beta_c}{\beta_i + \beta_c} D_i'^2

should be chosen so that small components with β weights less than
0.05 are more readily clustered, while components with β weights
exceeding 0.05 are clustered less readily. Fig 3.11 shows that the
contour β_i β_c/(β_i + β_c) = 0.05 is close to the line β_i = 0.05 inside
the region of interest, except when β_i is nearly equal to β_c. Thus
it is suggested that to give a good mixture approximation, the threshold
for D_i^2 should be set to:

    T_C = 0.05 T' ,

where T' defines the hyperellipsoid containing only 1% of the
probability mass. (T' can be found from tables of χ^2 with the
number of degrees of freedom equal to the dimension of the statespace.)

Each cluster of components (some clusters may consist of a single
component) is approximated by a single Gaussian defined by equations
(3.2) to (3.4). Clustering proceeds until the probability mass of the
unclustered components is less than B_T. As for the Joining Algorithm,
the purpose of B_T, which is set to 0.01, is to avoid wasting effort
on clustering insignificant components. If the number of clusters is
less than or equal to N_T, the unclustered components are deleted
and approximation is complete; otherwise further reduction is
necessary. This is achieved by repeating the clustering procedure on
the first approximation including the unclustered components, but with
the clustering threshold incremented by ΔT, ie T_C = T_C + ΔT. This
clustering operation is iterated until the necessary reduction has been
effected. The choice of the increment ΔT is a compromise between
the number of iterations required and the possibility of clustering
more components than necessary. In this study, the value of ΔT is
fixed:

    \Delta T = 0.05 \Delta T' ,

where T' + ΔT' defines the hyperellipsoid which contains 5% of the
probability mass of the principal component. Simulation work has shown
this to be a reasonable compromise. For the simulation examples of
this study, the statespace is four-dimensional, so from tables of χ^2
the algorithm thresholds have been set to:

    T_C = 0.01485

and

    \Delta T = 0.02065 .
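
The two threshold values can be checked without tables. The sketch below uses the closed-form χ^2 distribution function that holds for an even number of degrees of freedom and inverts it by bisection; the function names are hypothetical. The quoted increment of 0.02065 is reproduced when the outer hyperellipsoid T' + ΔT' is taken as the 5% point of χ^2 with four degrees of freedom.

```python
import math


def chi2_cdf_even(x, dof):
    """Chi-square distribution function for an even number of degrees of
    freedom: 1 - exp(-x/2) * sum_{k < dof/2} (x/2)^k / k!."""
    if dof % 2:
        raise ValueError("this closed form needs an even number of degrees of freedom")
    half = x / 2.0
    series = sum(half ** k / math.factorial(k) for k in range(dof // 2))
    return 1.0 - math.exp(-half) * series


def chi2_quantile_even(p, dof, upper=100.0):
    """Invert the distribution function by bisection on [0, upper]."""
    lower = 0.0
    for _ in range(200):
        mid = 0.5 * (lower + upper)
        if chi2_cdf_even(mid, dof) < p:
            lower = mid
        else:
            upper = mid
    return 0.5 * (lower + upper)


n = 4                                    # dimension of the statespace
t_inner = chi2_quantile_even(0.01, n)    # hyperellipsoid enclosing 1% of the mass
t_outer = chi2_quantile_even(0.05, n)    # hyperellipsoid enclosing 5% of the mass
t_c = 0.05 * t_inner
delta_t = 0.05 * (t_outer - t_inner)
print(t_c, delta_t)
```

The computed values are about 0.0149 and 0.0207, agreeing with the thresholds quoted above to the accuracy of three-figure tables.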

Although ΔT is normally fixed, an override is provided which may
increase the clustering threshold further to ensure that at least one
component is clustered on each iteration. This mechanism is shown in
the flow diagram of the algorithm given in Fig 3.12.
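
A minimal scalar sketch of the clustering procedure is given below. It is an interpretation of the description in this section, not a transcription of Fig 3.12: the override mechanism is omitted, the cluster approximation is assumed to be the usual moment-preserving combination (equations (3.2) to (3.4) are not reproduced here), and all names and example values are hypothetical.

```python
# Sketch under stated assumptions: scalar components are (weight, mean, variance)
# tuples; the override for the threshold increment is deliberately left out.

def merge_all(members):
    """Moment-preserving single-Gaussian approximation of a cluster."""
    beta = sum(b for b, _, _ in members)
    mean = sum(b * m for b, m, _ in members) / beta
    var = sum(b * (v + (m - mean) ** 2) for b, m, v in members) / beta
    return (beta, mean, var)


def cluster_pass(comps, t_c, b_t=0.01):
    """One clustering pass; returns (clusters, unclustered components)."""
    remaining = sorted(comps, reverse=True)      # largest weight first
    clusters = []
    while remaining and sum(b for b, _, _ in remaining) >= b_t:
        centre = remaining[0]
        b_c, m_c, v_c = centre
        members, rest = [centre], []
        for comp in remaining[1:]:
            b_i, m_i, _ = comp
            # Scalar form of equation (3.9): normalized by the centre variance.
            d2 = b_i * b_c / (b_i + b_c) * (m_i - m_c) ** 2 / v_c
            (members if d2 < t_c else rest).append(comp)
        clusters.append(merge_all(members))
        remaining = rest
    return clusters, remaining


def clustering_reduce(comps, n_t, t_c, delta_t, b_t=0.01):
    """Repeat passes with a growing threshold until at most n_t clusters remain."""
    clusters, leftover = cluster_pass(comps, t_c, b_t)
    while len(clusters) > n_t:
        t_c += delta_t
        clusters, leftover = cluster_pass(clusters + leftover, t_c, b_t)
    return clusters                              # leftover components are deleted


# Hypothetical mixture: two tight groups plus one insignificant outlier.
mixture = [(0.4, 0.0, 1.0), (0.3, 0.1, 1.0), (0.2, 10.0, 1.0),
           (0.095, 10.2, 1.0), (0.005, -20.0, 1.0)]
clusters = clustering_reduce(mixture, n_t=2, t_c=0.01485, delta_t=0.02065)
print(len(clusters))
```

For this example a single pass yields two clusters, and the outlier of weight 0.005 is left unclustered and deleted. Requesting a single cluster with the same fixed increment would need a very large number of passes, which illustrates why an override on the increment is useful.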

3.7.2  An example of mixture reduction with the Clustering Algorithm

The Clustering Algorithm has been applied to the same
four-dimensional Gaussian mixture distribution that was used to demonstrate
the operation of the Joining Algorithm (see Figs 3.2 and 3.3).

For N_T = 10, the final partition is shown in Fig 3.13 and the
corresponding mixture approximation is shown in Fig 3.14. The
approximation consists of nine components, although several algorithm
iterations were required; ie the acceptable modification limit was
exceeded. The composition of the final clusters is similar to the
grouping produced by the Joining Algorithm (see Fig 3.4), although
there are detailed differences. Also the mixture approximation is
very similar to that produced by the Joining Algorithm (see Fig 3.5),
and appears to be an excellent approximation of the original (see
Fig 3.2).

The partition of components and the mixture approximation
produced by the Clustering Algorithm with N_T = 4 are shown in
Figs 3.15 and 3.16. The partition of the components is very similar
to that of the Joining Algorithm with N_T = 4 (see Fig 3.6), the
difference being the assignment of three components with β weights
below 0.001 and one component with 0.01 < β < 0.1. It is chiefly
this one component which accounts for the obvious difference between
the Clustering Algorithm approximation and the Joining Algorithm
approximation (Fig 3.7); see also the contour plot Fig 3.17. These
approximations are significantly different from the original (Fig 3.2).

3.7.3  The control of mixture components for a tracking example

The Clustering Algorithm has been applied to mixtures generated
by the same tracking example as used to exercise the Joining Algorithm in
section 3.6.3. Fig 3.18 shows the number of components before and
after reduction, the maximum clustering distance* threshold, and the
number of algorithm iterations for each time step with N_T = 10.
Adequate reduction to within ten components is achieved with a single
algorithm iteration (that is with the threshold T_C) on 72% of the
time steps, and no more than five iterations are ever required. Also
comparing the plot of threshold value with the plot of the number of
iterations, it can be seen that the override mechanism for increasing
the threshold value by a jump in excess of ΔT has only been invoked
on three time steps. When adequate reduction cannot be achieved
without increasing the threshold above T_C, the number of components
in the reduced mixture is never less than N_T - 3, showing that the
algorithm did not merge many more components than necessary. Finally
note that the plot of the number of components before and after
reduction is similar to the corresponding plot for the Joining Algorithm
(see Fig 3.9). Also the plots of the maximum joining distance and the
clustering threshold show similarities.

* We loosely refer to clustering distance although this is actually the
squared measure D_i^2.

Fig 3.19 shows the management of mixture components for N_T = 4.
The work load of the Clustering Algorithm is considerably increased for
this smaller value of N_T. Adequate reduction with a single iteration
is achieved on only 22% of occasions, and a maximum of 11 iterations
was required for one time step. However the override facility for
increasing  the  threshold  level  was  frequently  employed,  and  without 
this  feature  the  maximum  number  of  iterations  would  have  been  close 
to 100. There is some similarity between the plot of number of
components before and after reduction and the corresponding plot for the
Joining Algorithm shown in Fig 3.10. However the match is not so good
as for N_T = 10, showing that for small N_T the number of components
is more sensitive to the reduction algorithm employed. This is probably
because  significant  components  have  to  be  merged  to  achieve  the 
necessary  reduction. 

3.8  Conclusions

Two new mixture reduction algorithms have been developed to meet
a set of requirements for Bayesian tracking filters. These algorithms
have been derived from the principle that the increase in the
within-component covariance should be minimized when components are merged.

When applied to a Gaussian mixture distribution from a tracking example,
excellent approximations can be achieved provided the number N_T of
components allowed in the approximation does not force significant
distinct components to merge. For small values of N_T, the
approximations produced by the two algorithms were clearly different and some
features of the original distribution were obviously blurred.

In  Appendix  D  the  computational  requirements  of  the  two  reduction 
algorithms  are  analysed.  It  is  shown  that  if  the  number  of  components 
before  reduction  is  large  compared  with  that  after  reduction,  the 
number  of  operations  required  by  the  Joining  Algorithm  lies  between 
the  lower  and  upper  bounds  of  the  operation  count  for  the  Clustering 
Algorithm.  Also  for  the  Joining  Algorithm  a  large  distance  matrix 
must  be  stored,  while  for  the  Clustering  Algorithm  storage  requirements 
over  those  necessary  to  hold  the  mixture  components  are  negligible. 

In  the  following  chapter  we  compare  the  performance  and  the  computation 
time  of  the  two  algorithms  and  the  PDAF  for  an  example  of  the  baseline 
problem. 

Fig 3.12  Flow diagram of the Clustering Algorithm

Fig 3.16  The approximated mixture pdf produced by the Clustering Algorithm
with N_T = 4 (4 components)

4  PERFORMANCE COMPARISON OF THE JAF WITH THE CAF AND THE EFFECT OF
VARYING N_T

4.1  Introduction

Simulation  studies  are  essential  for  assessing  the  performance  of 
tracking  filters  employing  the  mixture  reduction  algorithms  described  in 
the  previous  chapter.  Since  tracking  is  a  statistical  operation  it  is 
necessary  to  carry  out  Monte  Carlo  simulation  runs  to  obtain  estimates 
of  filter  performance.  Performance  has  been  assessed  for  an  example  of 
the  baseline  problem:  the  tracking  of  a  target  moving  in  a  plane.  The 
Bayesian  solution  of  Chapter  2  has  been  programmed  for  the  example,  and 
the  approximation  techniques  of  Chapter  3  have  been  included  to  produce 
a  Joining  Algorithm  Filter  (JAF)  and  a  Clustering  Algorithm  Filter  (CAF) . 
These  filters  both  employ  a  coarse  acceptance  test  (see  section  3.2)  and, 
save for the reduction technique, they are identical. Also for
comparison the single Gaussian approximation PDAF has been programmed.

The  main  objective  of  the  simulations  in  this  chapter  is  to  compare 
the  performance  of  the  filters  and  to  examine  the  effect  of  varying  the 
maximum number N_T of components allowed in the approximation. This
has  been  examined  for  a  single  set  of  problem  parameters,  chosen  at  a 
point  in  the  space  where  the  JAF  and  the  CAF  outperform  the  PDAF.  The 
variation  of  performance  over  the  problem  parameter  space  for  fixed 
reduction  algorithm  parameters  is  assessed  in  Chapter  5.  In  all  of  these 
simulations,  the  generated  target  trajectories  and  the  statistics  of  the 
simulated  measurements  are  perfectly  matched  to  the  filter  parameters. 
Clearly  in  real  life  this  is  unlikely  to  be  the  case.  In  Chapter  6, 
filter performance for data statistics mismatched to filter parameters
is  assessed  for  a  similar  tracking  problem.  Also  tracking  performance 
against  some  'realistic'  trajectories  is  investigated. 

4.2  The tracking problem

Target trajectories have been simulated using a second order model
which is the basis of the α-β filter (Refs 49 and 50). This model has
been widely used in tracking problems as it is simple, while providing
an adequate trajectory representation for many practical cases. The
trajectory described by the model is a variation about a constant
velocity course, whose magnitude and direction are defined by initial
conditions. The deviation from this mean course is controlled by the
variance q of the model driving noise. The second order model is
defined by the following equation:

where the state vector x_k represents the position and velocity of the
target at time kΔt:

    x_k = (x, \dot{x}, y, \dot{y})_k^T ,

Δt is the time step between measurements, and w_k is a 2 x 1 vector
from a Gaussian random sequence with zero mean and constant covariance:

    Q = \begin{pmatrix} q & 0 \\ 0 & q \end{pmatrix} .

Thus, to generate a trajectory x_k, Gaussian random numbers of
variance q were fed through the recurrence relation (4.1),
starting from some initial condition x_0. Note that the target
velocity described by equation (4.1) is a random walk.
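
Trajectory generation of this kind can be sketched as follows. Equation (4.1) is not reproduced in this copy, so the sketch assumes the common discrete second order form in which the driving noise enters position through Δt^2/2 and velocity through Δt; that choice, the function name and the example values are assumptions, and the model actually used may weight the noise differently.

```python
import random


def simulate_trajectory(x0, q, dt, steps, rng=None):
    """Generate states (x, x_dot, y, y_dot) from an assumed second order model.

    Assumption: noise w enters position as 0.5*dt**2*w and velocity as dt*w,
    independently on each axis, with variance q.
    """
    rng = rng or random.Random(0)
    x, vx, y, vy = x0
    sigma = q ** 0.5
    path = [tuple(x0)]
    for _ in range(steps):
        wx = rng.gauss(0.0, sigma)
        wy = rng.gauss(0.0, sigma)
        # Tuple assignment so that the position update uses the old velocity.
        x, vx = x + vx * dt + 0.5 * dt ** 2 * wx, vx + dt * wx
        y, vy = y + vy * dt + 0.5 * dt ** 2 * wy, vy + dt * wy
        path.append((x, vx, y, vy))
    return path


# Hypothetical example: start at the origin with speed 10 along the x axis.
path = simulate_trajectory((0.0, 10.0, 0.0, 0.0), q=1.0, dt=1.0, steps=20)
print(path[-1])
```

With q = 0 the sketch reduces to the nominal constant velocity course, and for q > 0 each velocity component is a sum of independent increments, ie a random walk, as noted above.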


At each time step k, a set of Cartesian position measurements
has been generated to simulate sensor measurements. This set consists
of at most one true measurement plus uniformly distributed false
measurements. The probability of a true measurement occurring is the
detection probability P_D. A true measurement is a Gaussian
perturbation about the target position and it is generated from the
state vector x_k using the equation:

    z_k = H x_k + v_k ,    H = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}        (4.2)

where v_k is a 2 x 1 vector of Gaussian measurement noise with zero
mean and constant covariance:

    R = \begin{pmatrix} r & 0 \\ 0 & r \end{pmatrix} .

The false measurements are independent of the target and are uniformly
distributed over the sensor surveillance region, with density ρ per
unit area. At each time step, the surveillance region of the sensor is
arranged to be sufficiently extensive to include the target position
and the acceptance regions of the filters, while track is maintained.
False measurements were simulated by generating A_k ρ pairs of uniformly
distributed random numbers with appropriate scaling, A_k being the
area of the surveillance region at time step k.
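
The construction of one measurement set can be sketched as below. The rectangular surveillance region, the rounding of A_k ρ to a whole number of false measurements, and all names and example values are assumptions made for illustration.

```python
import random


def generate_measurements(state, r, p_d, rho, region, rng):
    """Return a shuffled list of (x, y) measurements for one time step.

    state  : (x, x_dot, y, y_dot) of the target
    region : (x_min, x_max, y_min, y_max), an assumed rectangular region
    """
    x, _, y, _ = state
    x_min, x_max, y_min, y_max = region
    measurements = []
    # At most one true measurement, present with probability p_d.
    if rng.random() < p_d:
        sigma = r ** 0.5
        measurements.append((x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma)))
    # False measurements: A_k * rho uniformly distributed points
    # (rounded to an integer here, which is an assumption).
    area = (x_max - x_min) * (y_max - y_min)
    for _ in range(round(area * rho)):
        measurements.append((rng.uniform(x_min, x_max), rng.uniform(y_min, y_max)))
    rng.shuffle(measurements)   # hide the identity of the true measurement
    return measurements


rng = random.Random(1)
region = (-50.0, 50.0, -50.0, 50.0)
scan = generate_measurements((0.0, 10.0, 0.0, 0.0), r=1.0, p_d=1.0,
                             rho=0.012, region=region, rng=rng)
print(len(scan))
```

Shuffling the list reflects the fact that the filters are not told which measurement, if any, originated from the target.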

At  each  time  step,  every  simulated  measurement  is  passed  to  the 
tracking  filters  which  attempt  to  estimate  the  current  target  state 
vector.  The  following  information  is  available  to  the  filters: 

(i)  the value of the initial state vector x_0, so the
initial position and velocity of the target are known perfectly,

(ii)  the  model  of  target  motion,  equation  (4.1), 

(iii)  the  relationship  between  the  state  vector  and  the  true 
measurement,  equation  (4.2), 

(iv)  the statistics of the false measurements (density ρ),
the true measurement noise (variance r), and the model
driving noise (variance q),

(v)  the detection probability P_D of the sensor.

The tracking filters do not know:

(a)  the values of the state vector x_k, or the noise vectors
v_k and w_k at each time step,

(b)  the identity of the true measurement.

Clearly this is an example of the tracking problem given in
section 2.2 and so the Bayesian solution of Chapter 2 may be directly
applied.

4.3  Parameters of the problem

To analyse this tracking problem it is convenient to normalize
the variables so that the unit of time is Δt and the unit of distance
is √r. Then the non-dimensional form of the state vector is:

    ( x/\sqrt{r}, \dot{x} \Delta t/\sqrt{r}, y/\sqrt{r}, \dot{y} \Delta t/\sqrt{r} )_k^T .

If the target model and measurement equations are written in the
normalized form, it can be shown that the statistics of the problem are
completely specified by three non-dimensional parameters:

(i)  the ratio which determines the values of the filter gains for the
standard α-β filter, ie in the absence of false measurements. As
this parameter increases the α-β filter becomes more responsive to
position measurements.

(ii)  ρr, the expected number of false measurements falling within
a square whose side is one standard deviation of the measurement error.

(iii)  P_D, the detection probability.

Since the initial state vector is assumed to be known perfectly, the
filter performance in normalized co-ordinates should only depend on
these three parameters. (This is because the problem may be written as
the estimation of the deviation about the nominal constant velocity
course defined by the initial state vector.)

The filter performance comparisons reported in this chapter are
for  a  single  point  in  the  parameter  space: 


r 


    ρr = 0.012

and

    P_D = 1 .

These  values  have  been  chosen  to  illustrate  the  possible  improvement  in 
tracking  performance  of  the  new  reduction  algorithms  over  the  PDAF. 

A  full  investigation  of  filter  performance  over  the  parameter  space  is 
reported  in  Chapter  5.  For  the  above  parameters,  the  equivalent  Kalman 
filter  (receiving  only  true  measurements)  rapidly  reaches  steady  state 
conditions,  and  the  standard  deviation  of  the  position  error  on  one  of 
the  co-ordinates  approaches  within  1%  of  its  final  steady  state  value 
after only four time steps. Also the expected number of false
measurements that would be received by an acceptance gate with  = 0.001
based on steady state Kalman filter covariances is 2.084 (see section
3.2). In the simulations, the initial target position was taken as the
origin, the initial speed was 10√r/Δt and the initial heading was
chosen randomly from a uniform distribution over [0, 2π] for each
replication. As noted, the initial target position and velocity do not
affect the filter performance.

4.4  Track loss criterion and simulation program

The performance of the filters was assessed by measuring how long
they were able to maintain track on the target, ie the track lifetime.
Each filter was allowed to continue tracking the target until track was
lost. A track was deemed to be lost if either of the following criteria
was satisfied:

(i)  The true measurement is rejected by the acceptance test for
five consecutive time steps.

(ii)  |\hat{x}_k - x_k| > 10 \sigma_{xk}   or   |\hat{y}_k - y_k| > 10 \sigma_{yk}

for five consecutive time steps, where (x̂_k, ŷ_k) is the filter estimate
(the mean of the posterior distribution) of the target position at time
step k, (x_k, y_k) is the actual target position at time step k, and
σ_xk and σ_yk are the standard deviations of the position estimates of
the equivalent Kalman filter (ie the optimal filter for the same problem
but with ρ = 0).


These track loss criteria are testing for consistent rejection of
the true measurement, or a tracking error which is consistently large
in comparison with the expected error of the equivalent Kalman filter;
consistent being defined as five time steps and large being defined as
ten standard deviations.
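
The two track loss criteria can be expressed as a simple test over the history of a track. The record layout and function name below are hypothetical, and the sketch is not taken from the simulation program itself.

```python
def track_lost(history, window=5, factor=10.0):
    """Apply the two track loss criteria to a sequence of time steps.

    Each record is (true_accepted, est_x, est_y, true_x, true_y, sigma_x, sigma_y),
    where sigma_x and sigma_y are the equivalent Kalman filter position
    standard deviations. This record layout is an assumption.
    """
    rejected_run = 0   # consecutive steps with the true measurement rejected
    error_run = 0      # consecutive steps with a large position error
    for accepted, ex, ey, tx, ty, sx, sy in history:
        rejected_run = 0 if accepted else rejected_run + 1
        large = abs(ex - tx) > factor * sx or abs(ey - ty) > factor * sy
        error_run = error_run + 1 if large else 0
        if rejected_run >= window or error_run >= window:
            return True
    return False


ok = (True, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0)
rejected = (False, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0)
far_off = (True, 11.0, 0.0, 0.0, 0.0, 1.0, 1.0)
print(track_lost([rejected] * 5), track_lost([rejected] * 4 + [ok]))
```

Resetting each counter whenever its condition fails is what makes the test one of consecutive, rather than cumulative, occurrences.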


One hundred target trajectories with associated measurements were
generated, so that the mean track lifetime and the distribution of
lifetimes could be estimated. The same hundred trajectories and
measurement sets were used for each filter at each setting of N_T,
which was varied between 1 and 30.

In practice, to avoid storing and reading large amounts of data,
the trajectory and measurements were generated as they were required by
the  filter  at  each  time  step;  data  generation  and  filtering  were 
performed  within  a  single  computer  program.  The  track  loss  test  and 
other  assessment  operations  were  also  performed  within  this  program, 
which  was  used  for  the  CAF/JAF  comparison  of  this  chapter  and  to 
produce  results  for  Chapter  5.  The  program  includes  two  tracking 
filters:  the  Bayesian  filter  of  Chapter  2  and  the  PDAF  which  provides 
a  useful  baseline  for  comparison.  The  Bayesian  filter  may  be  run  with 
either  the  Joining  Algorithm  to  give  the  JAF  or  with  the  Clustering 
Algorithm  to  give  the  CAF. 

All  computer  programs  were  written  in  Fortran  77  and  the  filter 
simulations  were  run  on  the  Cray  IS  at  RAE  Famborough.  Thus  where 
cpu  times  are  quoted,  they  are  for  this  Cray  computer.  Due  to  the 
structure  of  the  algorithms,  the  'vector'  processing  capabilities  of 
the  Cray  were  hardly  used. 

4.5 Results

4.5.1  Average  number  of  time  steps  to  track  loss 

Fig 4.1 shows the average number $N_{AVE}$ of time steps until track
loss as a function of $N_T$, for filters using the Clustering Algorithm
and the Joining Algorithm with thresholds set to the values given in
sections 3.6.1 and 3.7.1. $N_T = 1$ corresponds to the special case of
the PDAF, and clearly the filters which retain more than one mixture
component perform better than the PDAF for this example. The
Joining Algorithm filter gives slightly larger values of $N_{AVE}$ than
the Clustering Algorithm, possibly due to the setting of the thresholds.

Also shown in Fig 4.1 is the filter performance for the JAF with
T = 0, ie with the acceptable modification check switched off. Note
that the original setting of T for the JAF does not significantly
degrade the filter's performance, and the performance for all three
cases shown in Fig 4.1 is similar. For $N_T < 10$, $N_{AVE}$ rises
approximately linearly with $N_T$, while for $N_T > 10$, $N_{AVE}$ is nearly
constant. Thus, for this example, $N_T = 10$ appears to be about the
critical level below which tracking performance begins to degrade.
(The mechanism of track estimation is discussed in Chapter 5.) For
the JAF with T = 0 and $N_T$ very large, the mixture is not subject to
approximation, and so this constant level is the optimal value of
$N_{AVE}$.

Fig 4.2 shows the average number of mixture components before
and after reduction for the three cases of Fig 4.1. Comparing Fig
4.2a&b with 4.2c, the effect of the acceptable modification check,
defined by the Clustering and Joining Algorithm thresholds, in regulating
the number of components for the large values of $N_T$ is obvious. For
small values of $N_T$, the approximation for all three cases is
principally controlled by $N_T$ itself. For this example, the thresholds
become the main regulators of the approximation at about $N_T = 10$, so
the acceptable modification check appears to select the minimum number
of components for near optimal performance. Clearly this cannot be
guaranteed for other tracking problems, but since the thresholds were
not specially tuned for this simulation, the performance with other
problems may not be far from optimal.

4.5.2  Distribution  of  number  of  time  steps  to  track  loss 

In the previous section, the average track lifetime was discussed.
In this section we consider the distribution of track lifetimes about
this mean. To illustrate the distribution and to compare the performance
of the CAF and JAF for individual replications, the track maintenance
times have been plotted in Figs 4.3, 4.4 and 4.5 for $N_T$ = 2, 4 and 30
respectively. In these diagrams each point corresponds to a single
replication, and the X and Y co-ordinates of the point are the
time steps at which the JAF and CAF (with original threshold settings)
lost track respectively. So points falling on the X = Y line indicate
that both filters lost track at the same time step. For large values of
$N_T$ (eg $N_T = 30$, Fig 4.5), the performance of the two filters is
remarkably similar for the majority of replications. The few replications
biasing $N_{AVE}$ in favour of the JAF are obvious. For small values of
$N_T$ (eg $N_T = 2$, Fig 4.3), the points are scattered further from X = Y
although $N_{AVE}$ is almost identical for the two filters. These results
bear out the observation that the mixture approximations produced by the
two reduction algorithms are usually very similar for large $N_T$, while
for small $N_T$ there are often clear differences.

Figs  4.6  and  4.7  show  histograms  of  the  data  points  from  Figs  4.3 
and 4.5; ie for the track lifetimes for the JAF and CAF with $N_T = 2$
and $N_T = 30$. It can be seen that those track lifetimes exceeding
20  time  steps  can  be  well  fitted  by  an  exponential  distribution  of 
the  form: 


$$p(t) = \begin{cases} \dfrac{1}{a} \exp\left(-\dfrac{t - t_{min}}{a}\right) & \text{for } t > t_{min}, \\ 0 & \text{otherwise,} \end{cases}$$

where $(t_{min} + a)$ is the average lifetime of tracks which survive for
at least $t_{min} = 20$ time steps. This is confirmed by a $\chi^2$ test:
the exponential hypothesis is only once rejected at the 5% level of
significance for any of the 24 sets of replications. This exponential
distribution  indicates  that  after  20  time  steps,  the  probability  of 
losing  track  is  independent  of  track  lifetime,  ie  after  an  initial 
transient  the  filters  reach  steady  state  conditions.  The  value 
$t_{min} = 20$ was chosen by examining the transient behaviour of the
equivalent Kalman filter (see last paragraph of section 4.3) and by
inspection of the simulation results. The distribution parameter a
may be interpreted as the average number of time steps that a track will
survive in steady state conditions. Estimates of a are shown in
Fig 4.8. These values are slightly greater than $N_{AVE} - 20$, as tracks
surviving  for  less  than  20  time  steps  are  excluded. 
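
The estimate of a described above can be sketched in a few lines of Python. This is an illustrative fragment only: the function name and the list-based interface are hypothetical, and the estimator used is the mean excess lifetime beyond $t_{min}$, which is the maximum likelihood estimate for an exponential distribution shifted by a known amount.

```python
def estimate_a(lifetimes, t_min=20):
    """Estimate the steady state survival parameter a from track lifetimes.

    Only tracks surviving beyond t_min time steps are used; the estimate is
    their mean lifetime minus t_min. Returns (a_hat, number_of_tracks_used).
    """
    survivors = [t for t in lifetimes if t > t_min]
    if not survivors:
        raise ValueError("no track survived beyond t_min")
    a_hat = sum(survivors) / len(survivors) - t_min
    return a_hat, len(survivors)
```

For example, lifetimes of 5, 25, 35, 60 and 10 time steps leave three surviving tracks with a mean lifetime of 40, so the estimate of a is 20.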

It  is  important  to  establish  the  distribution  of  track  lifetimes 
as  this  allows  one  to  specify  confidence  limits  on  the  estimate  of  a  . 
For  an  exponential  distribution,  the  95%  confidence  limits  are 
approximately : 


where N is the number of replications used to estimate a. These
limits define a fixed interval when track lifetime is plotted on a
logarithmic scale. In the performance estimates of the following
chapters, these confidence limits are shown with $N_{AVE}$, on the
assumption that track survival times are also exponentially distributed
in these cases and that $t_{min}$ is small compared with $N_{AVE}$.
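
One standard large-sample approximation, offered here as an assumption rather than as the expression used in this study, treats the estimate of a from N exponentially distributed lifetimes as normally distributed with standard deviation $a/\sqrt{N}$. Inverting this gives limits which are fixed multiples of the estimate, and hence a fixed interval on a logarithmic scale. The function name below is hypothetical.

```python
import math

def exp_mean_limits(a_hat, n, z=1.96):
    """Approximate confidence limits for the mean of an exponential distribution.

    a_hat: sample mean of n exponentially distributed lifetimes.
    z:     normal deviate (1.96 for approximately 95% limits).
    Uses the normal approximation a_hat ~ N(a, a^2 / n), which needs n > z^2.
    """
    if n <= z * z:
        raise ValueError("n is too small for the normal approximation")
    half = z / math.sqrt(n)
    return a_hat / (1.0 + half), a_hat / (1.0 - half)
```

With 100 replications the limits are roughly 0.84 and 1.24 times the estimate, whatever its value.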
4.5.3 Computation time

Fig 4.9 shows the average cpu time for the filters to
perform a single time step. The time scale (which is logarithmic) is
normalized to the average cpu time for a single PDAF time step which,
for the data simulated here, was 1.12 ms on a Cray 1S computer. The
computational effort is divided between the propagation of mixture
components or tracks and mixture reduction. For the two filters with
the original threshold settings (Fig 4.9a&b), the cpu time falls rapidly
to nearly constant values for $N_T > 10$. Also for low values of $N_T$ most
time is spent reducing the mixture, and as $N_T$ increases more time is
required for track propagation while the mixture reduction time decreases.
This is explained by Fig 4.2: the initial high values of cpu time are due
to time spent reducing large mixtures which result from inadequate
approximations at values of $N_T < 6$. Except for the case $N_T = 6$, the
JAF was more time consuming than the CAF, usually by about 50%, and as
expected, the execution times for the filters were in all cases
considerably greater than the PDAF. However for $N_T > 10$, the five
fold increase in execution time for the CAF may well be an acceptable
price for the performance improvement offered by this filter.

The time taken by the JAF with T = 0 is shown in Fig 4.9c.
This clearly shows the value of the acceptable modification check in
the reduction algorithms: for the insignificant improvement for
$N_T > 10$ over the filter with the original threshold settings, there
is a large increase in computational overhead. The extra processing
time is required for the propagation and reduction of the extra tracks
generated when the full $N_T$ components are retained for $N_T > 10$
(see Fig 4.2).

4.6 Conclusions

For  the  chosen  simulation  example,  the  JAF  and  CAF  both  give  a 
substantial  performance  improvement  over  the  PDAF.  The  penalty  for 
this  is  the  increased  computational  requirements  of  the  more  complex 
filters.  Minimum  computation  time  and  near  optimal  performance  were 
obtained when satisfactory mixture approximation (defined by the
Clustering and Joining Algorithm thresholds) was achieved within the
maximum number of mixture components allowed. Under these conditions the
track survival times for the JAF and CAF were identical on at least
85%  of  the  replications.  This  suggests  that  filter  performance  is  not 
highly  sensitive  to  the  method  of  mixture  reduction,  provided  that  the 
most important mixture components are retained. However, the computation
time for the JAF was almost always greater than that for the
CAF, usually by about 50%. Thus in the remainder of this study the
Clustering Algorithm is always used for mixture reduction.

Fig 4.4 Track maintenance times for each replication, $N_T = 4$

5 SIMULATION STUDIES OF THE CAF AND THE PDAF

5.1 Introduction

The performance of the tracking filters for the problem
described in section 4.2 depends on only three problem parameters.
If the probability of detecting the true measurement is unity, the two
remaining parameters are $\rho r$, the normalized density of false
measurements, and $q\Delta t^4/r$, the normalized acceleration variance of
the target. The primary aim of this chapter is to examine the
performance of the CAF as a function of these two parameters. The
performance measure is the average track survival time $N_{AVE}$, and the
baseline for the assessment is the performance of the PDAF. We shall
attempt to
identify  the  region  of  the  parameter  space  where  the  more  complex  CAF 
gives  a  significant  performance  improvement  over  the  PDAF.  In  the  light 
of  the  simulation  example  of  the  previous  chapter,  the  maximum  number 
of  components  that  may  be  retained  by  the  Clustering  Algorithm  has 
been  set  at  20.  It  is  hoped  that  these  simulation  results  will  provide 
an  assessment  and  design  aid  for  this  type  of  tracking  problem. 

In  the  second  part  of  this  chapter  (section  5.3),  a  single  run  of 
the tracking filters is examined in detail. The purpose of this
demonstration is to give a physical insight into how the Bayesian
filter estimate is produced. The example is of a situation where track
loss may be avoided by the retention of more than one mixture component
(using the Clustering Algorithm).

5.2 The performance of the CAF over the problem parameter space

5.2.1 Presentation of results

The average track survival time $N_{AVE}$ for the CAF and the PDAF
is shown as a function of $\rho r$ in Figs 5.1 to 5.5 for
$q\Delta t^4/r = 10^{-4}$, $10^{-2}$, 1, $10^{2}$ and $10^{4}$ respectively.
As in Chapter 4 the initial state vector was known perfectly, and the
filters were run until the track loss criteria of section 4.4 were
satisfied. $N_{AVE}$ is the average of 100 replications and 95% confidence
limits are shown with each point (assuming track lifetime is
exponentially distributed). Also shown is the average track lifetime
for a constant velocity prediction on the basis of the perfectly known
initial state vector. For this prediction measurements are ignored, so
that the average track lifetime $N_L$ of the prediction estimate is
independent of $\rho r$. $N_L$ should provide a lower limit on filter
performance which may be approached as the relative density $\rho r$ of
false measurements becomes large. Note that $N_L$ increases as
$q\Delta t^4/r$ decreases, ie as the normalized level of
target manoeuvre decreases. The average number of mixture components
before  and  after  approximation  is  also  shown  in  Figs  5.1  to  5.5.  The 
average  cpu  time  required  to  perform  a  single  time  step  for  the  CAF  and 
the  PDAF  is  recorded  in  Table  5.1.  For  each  pair  of  problem  parameters, 
this  table  also  indicates  whether  all  replications  were  halted  by  just 
one  of  the  two  track  loss  criteria. 

The parameter $\rho r$ is the density of false measurements relative
to the true measurement error variance. However the difficulty of the
tracking problem is likely to depend on the density of false
measurements from the 'point of view' of the filter. Consider a single
feasible track corresponding to a mixture component of the state pdf.
For given $\rho r$, the number of false measurements that are plausible
candidates for updating this track increases with $q\Delta t^4/r$, ie as the
variance of target manoeuvrability relative to r increases. On this
basis a more appropriate measure of problem difficulty may be the
average number of false measurements passed by a filter acceptance test
(see section 3.2). It is convenient to use the acceptance region based
on the equivalent steady state Kalman filter problem, as this is
independent of the values of individual measurements. The area of
this acceptance region is given by:


$$\frac{\pi r T_a}{1 - \alpha}$$

where $\alpha$ is the steady state value of the position Kalman gain and
$T_a = 13.82$ is the acceptance threshold corresponding to a 99.9%
chance of accepting the true measurement. It can be shown (see
Bridgewater, Ref 49, for example) that $\alpha$ is given by:


a 


where 


and 


qAt 

r 


Thus the average number $n_\infty$ of false measurements passed by this
acceptance test is:

$$n_\infty = \frac{\rho \pi r T_a}{1 - \alpha}$$

which for given $T_a$ depends only on $\rho r$ and $q\Delta t^4/r$. In Figs 5.1
to 5.5, the corresponding value of $n_\infty$ is given with $\rho r$ for each of
the results shown.
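
The calculation of $n_\infty$ can be sketched as follows. The closed form used for the steady state gain is the standard alpha-beta filter result for a piecewise constant white acceleration model, with tracking index $\lambda = \sqrt{q\Delta t^4/r}$; it is adopted here as an assumption and may differ in form from the expression given in the reference cited above. The function names are hypothetical.

```python
import math

def steady_state_alpha(q_dt4_over_r):
    """Steady state position gain of an alpha-beta filter (assumed model).

    Uses the standard closed form in the tracking index lam, where
    lam**2 = q * dt**4 / r.
    """
    lam = math.sqrt(q_dt4_over_r)
    root = math.sqrt(lam * lam + 8.0 * lam)
    return -0.125 * (lam * lam + 8.0 * lam - (lam + 4.0) * root)

def n_infinity(rho_r, q_dt4_over_r, t_a=13.82):
    """Average number of false measurements inside the steady state acceptance region."""
    alpha = steady_state_alpha(q_dt4_over_r)
    return rho_r * math.pi * t_a / (1.0 - alpha)
```

With $\rho r = 0.005$ and $q\Delta t^4/r = 1$ this sketch gives a gain of 0.75 and $n_\infty \approx 0.8683$, which agrees with the value quoted for the example of section 5.3.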

5.2.2  Discussion  of  results 

The filters show similar performance trends in each of Figs 5.1
to 5.5. As would be expected, for given $q\Delta t^4/r$, track survival time
increases as $\rho r$ and $n_\infty$ decrease. Also the track survival time of
the CAF approaches that of the PDAF for both small and large values of
$\rho r$. (This convergence for small $\rho r$ is not shown in Figs 5.1 and 5.2
as track survival time is so long in these cases that the computation
time for the simulation would be prohibitive.) Between these extremes,
the CAF outperforms the PDAF. The average track lifetime of the CAF
exceeds that of the PDAF by a factor of 10 in some cases, although an
improvement factor between 3 and 5 is more common. The region of the
$\rho r$, $q\Delta t^4/r$ space where the CAF gives a significant improvement over
the PDAF is sketched in Fig 5.6a. Although this diagram is only
approximate, the region clearly depends on $q\Delta t^4/r$. In Fig 5.6b
the region of improvement is sketched for the parameter space $n_\infty$,
$q\Delta t^4/r$. In this space the dependency on $q\Delta t^4/r$ is not so strong,
but is still quite evident. So performance of the CAF with respect to
the PDAF is not solely determined by $n_\infty$.

As the filters' performance deteriorates for increasing $\rho r$, so
the average number of mixture components before approximation increases.
This is the response of the filters to the increasing difficulty of the
tracking problem. Eventually the relative density $\rho r$ of false
measurements becomes so great that the received measurements are of
very little use to either filter; $N_{AVE}$ approaches $N_L$, the average
track lifetime for a simple prediction. In these circumstances the
filters generate a large number of mixture components (often averaging
over 100 before approximation) and consequently the average computation
time for a single filter time step becomes very large, particularly for
the CAF (see Table 5.1). It is quite possible that in these cases,
performance of the CAF is being limited by $N_T$ (compare with
section 4.5.1). Table 5.1 also shows that for large $\rho r$ every track is
lost to the excessive error check, Criterion (ii) (see section 4.4).

As $\rho r$ is reduced, the average number of mixture components
retained by the Clustering Algorithm decreases towards the lower limit
of a single Gaussian. Thus the CAF approximation approaches that of
the PDAF, which explains the convergence of $N_{AVE}$ for the two filters.
Note, however, that in several cases where the average number of mixture
components after reduction for the CAF is only fractionally above unity,
the average track lifetime for the CAF is about three times that of the
PDAF. Also, in these cases, Table 5.1 shows that the average CAF
computation time per filter iteration is only about twice that of the
PDAF. The convergence of $N_{AVE}$ for the CAF and PDAF with decreasing
$\rho r$ can be clearly seen in Figs 5.3 to 5.5. The same effect may be
expected for $q\Delta t^4/r = 10^{-2}$ and $10^{-4}$, but as already explained the
computation time for the necessary simulations is prohibitive. As $\rho r$
decreases, the average number of components before approximation tends
to 2 for both filters. One of these components corresponds to an
accepted measurement (nearly always the true measurement), while the
other corresponds to the prediction which allows for the possibility
that the true measurement has been rejected.

5.3 An example of filter operation

In  this  section,  to  gain  an  insight  into  the  operation  and 
performance of the CAF and the PDAF, a single run of the tracking
filters is examined in detail. The chosen example has the following
parameters:


$P_D = 1$, $q\Delta t^4/r = 1$ and $\rho r = 0.005$,

which gives $n_\infty = 0.8683$. Also the maximum number of components allowed
after approximation by the Clustering Algorithm was set at $N_T = 10$.
These  parameters  determine  filter  performance.  To  generate  an 
interesting  target  trajectory  the  initial  speed  was  chosen  to  be 
$u_0 = 10\sqrt{r}/\Delta t$, initial target heading was chosen randomly and the
initial  position  was  the  origin.  Fifty  time  steps  of  tracking  have 
been  simulated. 

5.3.1  Filter  tracking  performance 

For  this  example,  the  target  position  at  each  time  step  is  shown 
in  Fig  5.7,  together  with  the  tracks  or  position  estimates  (ie  the  mean 
of  the  pdf  of  target  position)  of  the  CAF  and  the  PDAF.  The  true 
measurement  generated  at  each  time  step  is  also  shown,  although  the 
false  measurements  have  not  been  plotted.  The  units  of  the  X  and  Y 
axes are normalized with respect to $u_0 \Delta t$, and the scale of the Y
axis  is  slightly  stretched. 


The  first  few  estimates  of  the  filters  are  very  accurate  since 
the  initial  target  state  vector  is  given.  The  CAF  position  estimate 
follows  the  target  quite  well  throughout  and  the  most  noticeable  errors 
occur  at  target  manoeuvres.  The  tracks  of  the  two  filters  are  very 
similar  up  to  about  time  step  17,  at  which  point  the  PDAF  estimate 
diverges  from  the  trajectory.  The  PDAF  apparently  regains  track 
(probably  fortuitously)  at  time  step  24,  but  fails  to  follow  the 
subsequent sharp target manoeuvre and soon finally diverges from the
target trajectory. The point at which the PDAF track fulfils the
second track loss criterion of section 4.4 is shown on the diagram.
As expected from the track plot, the CAF estimate does not trigger
either of the track loss criteria.

To provide a precise record of the tracking error history, plots
of the estimation error in position and velocity are shown in
Figs 5.8 and 5.9 for the CAF and PDAF respectively. The magnitude of
the actual position error at time step k is calculated from:

$$\sqrt{(\hat{x}_k - x_k)^2 + (\hat{y}_k - y_k)^2}$$

where $(\hat{x}_k, \hat{y}_k)$ is the estimate of the target position at time step k
and $(x_k, y_k)$ is the actual target position at time step k. The
calculation of the velocity error is similar. In addition to the
actual error, an indication of the filter's own view of its estimation
error is shown as a dashed line. The measure of error (denoted the
predicted error in Figs 5.8 and 5.9) is derived from the overall
covariance matrix of target state, and at time step k it is given by

$$\sqrt{P_{x_k} + P_{y_k}}$$

where $P_{x_k}$ and $P_{y_k}$ are the diagonal elements of the covariance
matrix corresponding to target position. The measure of the velocity
error is similar. Also shown for reference is the error measure (obtained
from the covariance matrix, as above) for the equivalent Kalman filter,
ie for the optimal filter in the absence of false measurements. Note that
the square of the predicted error measure for the filters is the expected
value of the actual error magnitude squared.
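
The two position error measures described above may be sketched in Python as follows. The function names and the tuple-based arguments are hypothetical; the predicted measure is taken to be the square root of the sum of the two position variances, which is consistent with its square being the expected value of the squared error magnitude.

```python
import math

def actual_position_error(est_xy, true_xy):
    """Magnitude of the actual position error at one time step."""
    return math.hypot(est_xy[0] - true_xy[0], est_xy[1] - true_xy[1])

def predicted_position_error(p_x, p_y):
    """Filter's own error measure from the position variances p_x and p_y."""
    return math.sqrt(p_x + p_y)
```

The velocity measures would follow the same pattern, using the velocity estimates and the corresponding diagonal elements of the covariance matrix.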

While the Kalman filter predicted error measure rapidly reaches a
steady state, the predicted errors for the PDAF and the CAF vary throughout
the track and are always greater than or equal to the Kalman filter
reference.  This  is  because  the  covariances  for  the  PDAF  and  the  CAF,  which 
must  operate  with  uncertain  measurement  association,  depend  upon  the  values 
of  the  received  measurements.  However,  the  covariance  of  the  Kalman 
filter,  which  assumes  that  only  true  measurements  are  received,  is 
independent  of  any  measurement  values.  The  predicted  error  measures 
of  the  PDAF  and  the  CAF  cannot  be  better  than  that  of  the  Kalman  filter 
since  the  latter  is  not  corrupted  by  false  measurements  (see  Ref  11). 

The  actual  estimation  errors  of  the  CAF  (Fig  5.8)  show  large 
fluctuations,  but  there  is  no  trend  of  increasing  error  through  the 
track.  There  are  clear  peaks  in  the  position  and  velocity  tracking 
errors  at  time  step  25,  when  the  target  executed  a  sharp  turn.  At  each 
of these maxima, the CAF's predicted error measure also peaks and closely
matches the actual error. Throughout the track, the CAF predicted error
is of the same order as the actual error and on several occasions
significant peaks coincide or are very close. Clearly, through
statistical fluctuation, a perfect match over the whole track is not
expected.

For  the  PDAF,  a  sharp  rise  in  position  error  following  track 
loss  is  clearly  shown  in  Fig  5.9.  The  filter's  predicted  position 
error  also  rises,  but  is  much  smaller  than  the  actual  error  by 
the  end  of  the  track. 

5.3.2  Filter  operation 

The CAF estimate of target state at a given time step is the mean
of a Gaussian mixture distribution, each component of which corresponds
to a feasible target track. As explained in Chapter 2, if several
measurements are passed by the coarse acceptance test, these tracks
subdivide, so producing a tree like pattern of potential tracks which are
controlled by the Clustering Algorithm. The growth of potential tracks
for the current example is illustrated in Fig 5.10. The overall CAF
estimate is shown as a dashed line, the PDAF estimate is shown as a
continuous black line and the actual course of the target is shown by
small circles. The potential tracks, after the clustering operation,
are shown as coloured lines, the colour of the line indicating the
weighting of the track (ie the probability that this is the correct
track). To show the potential tracks in the vicinity of the target loop
(labelled f on the track in Fig 5.10) more clearly, this part of the
picture has been enlarged and slightly stretched in Fig 5.11,
approximately by a factor of 6.

The number of potential tracks varies considerably over the
history of the track. It appears that the number of tracks increases
when the target executes a manoeuvre (see points c and f on the
target  trajectory  in  Fig  5.10).  This  is  because  the  target  model  gives 
the  expected  advance  of  the  target  as  a  straight  line,  and  so  tentative 
tracks  into  false  measurements  are  produced.  These  extra  tracks  are 
eliminated  when  a  steady  course  has  been  resumed  (points  b,  d 
and  g  in  Fig  5.10),  showing  that  the  Clustering  Algorithm  is  economical 
in  its  management  of  potential  tracks. 

Throughout most of the track history, at least one of the potential
target tracks closely follows the path of the target, and so has probably
correctly selected the true measurements. Also note that when a tentative
track with $\beta$ weight above 0.5 is produced (green line), this track is
almost always close to the actual target path. At times when the filter
appears to have difficulty in maintaining track, usually no potential
track with a large $\beta$ weight is produced (see Fig 5.11).

At point c and in the vicinity of point f on the target path
(Fig 5.10), the PDAF estimate diverges from the actual trajectory. At
these points, the Clustering Algorithm has allowed the growth of
diverging potential tracks, each with a significant $\beta$ weighting.

Fig 5.12 shows contours of the approximated position pdfs of both the CAF
and the PDAF at the 17th time step (the point after label c in
Fig 5.10). The actual position of the target is also marked and it is
clearly associated with the dominant cluster component, which accounts
for 85% of the total probability mass of the mixture. The second most
important cluster component has a $\beta$ weight of 0.12. The PDAF single
Gaussian approximation appears to be stretched between these two major
components. The PDAF approximation is the result of a separate track
propagation and approximation sequence, although up to this
time, the PDAF and CAF tracks are similar and so it is likely that
the PDAF generates a fairly similar posterior pdf before approximation
at this time step.

To  show  how  the  position  pdfs  evolve,  contours  for  the  following 
two  time  steps  (18  and  19)  are  shown  in  Figs  5.13  and  5.14.  At  time 
step 18 (Fig 5.13), after clustering there are now only two components.
The weak component of the previous time step has been eliminated.
The single Gaussian of the PDAF has been further stretched and flattened
so that its centre still lies between the two cluster components,
but has moved further away from the dominant component. At time step 19
(Fig 5.14) only a single Clustering Algorithm component is retained, which
is sharply concentrated on the target path. The PDAF approximation is
now well removed from the true path but still retains the elongated
form as a legacy of time step 17, which is no longer relevant. This
illustration  shows  the  importance  of  retaining  more  than  one  component 
at  critical  times  during  the  tracking  operation. 
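
The stretching of a single Gaussian between two well separated components can be illustrated with a short Python sketch. It collapses a mixture to the single Gaussian having the same overall mean and covariance (moment matching), which is assumed here to represent the PDAF style of approximation. The function name and the numerical values in the usage note are illustrative only and are not taken from the figures.

```python
def collapse_mixture(weights, means, covs):
    """Collapse a Gaussian mixture to one Gaussian with the same mean and covariance.

    weights: list of component weights (normalized internally).
    means:   list of mean vectors (lists of floats).
    covs:    list of covariance matrices (lists of lists of floats).
    """
    total = sum(weights)
    w = [wi / total for wi in weights]
    dim = len(means[0])
    # Overall mean is the weighted average of the component means.
    mean = [sum(wi * m[j] for wi, m in zip(w, means)) for j in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for wi, m, p in zip(w, means, covs):
        d = [m[j] - mean[j] for j in range(dim)]
        for r in range(dim):
            for c in range(dim):
                # Component covariance plus the spread of its mean about the overall mean.
                cov[r][c] += wi * (p[r][c] + d[r] * d[c])
    return mean, cov
```

For two components with unit covariance, weights 0.85 and 0.15, and means 10 units apart along X, the collapsed Gaussian has an X variance of 13.75 while the Y variance remains 1, so the single Gaussian is strongly elongated along the line joining the components.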

The situation six time steps later (time step 25) is shown in
Fig 5.15. This is close to the label f in Fig 5.10 and here the CAF
is propagating two main clumps composed of eight components. The PDAF
has recovered from its poor pdf approximation at time step 19
(possibly through a fortuitous absence of false measurements in the
track vicinity) and again straddles the CAF mixture pdf. However, as
can be seen in Fig 5.10, subsequently the PDAF fails to follow the
target  manoeuvre  and  the  track  is  lost  for  good.  The  single  Gaussian 
approximation  cannot  cope  with  two  diverging  branches,  each  with 
significant  weighting. 

Figs 5.16 and 5.17 show how the number of components of the
mixture distribution varies, and also the values of the most significant
$\beta$ weights at each time step for the CAF and the PDAF. For the CAF, the
values of the five largest $\beta$ weightings after clustering are shown as
five time traces; whereas for the PDAF, the five largest $\beta$ weightings
before approximation are shown. At each time step the $\beta$ weights have
been ordered in decreasing magnitude, so the BETA1 trace always shows
the largest value. Together with these traces, the number of mixture
components before and after approximation has also been plotted.

Throughout most of the track, the Clustering Algorithm (Fig 5.16)
keeps the number of components after approximation well below the
allowed limit of $N_T = 10$. Comparing Fig 5.8 with Fig 5.16, it can
be seen that when the filter is tracking well, the number of components
is kept low. Only when tracking becomes difficult, such as during the
target loop in this example, does the number of components rise and
significant $\beta$ weighting extend to more than three components after
clustering. In Fig 5.17 it can be seen that as the PDAF became lost,
the number of components before approximation rose greatly and eventually
reached a maximum of 1070. This increase is due to the expansion of
the filter's acceptance region, which accompanies an increase in the
tracking error as perceived by the filter. Before track loss, the
$\beta$ weighting traces show that there are usually only two or three
significant components, one of which is usually clearly dominant. As
the PDAF begins to lose track, the dominance of any one component
declines, and the BETA4 and BETA5 traces show a temporary increase. As
the number of components rises well above five, all the BETA1 to BETA5
traces fall towards zero as the weighting is shared amongst many
components. This indicates that at each time step the filter has
generated many hypotheses, each of which has a very small probability
of being correct.


Finally,  as  an  illustration  of  how  the  CAF  responds  to  losing 
track,  the  potential  tracks  produced  by  a  different  example  are  shown 
in  Fig  5.18.  The  parameters  of  this  example  are  the  same  as  the 
previous  case,  except  the  density  of  false  measurements  has  been 
doubled.  When  the  target  manoeuvres,  the  filter's  tracks  split 
into  two  diverging  branches,  one  of  which  continues  on  the  original 
target  heading  while  the  other  follows  the  target  manoeuvre.  However 
this  latter  branch  eventually  dies  out.  This  is  probably  due  to  the 
true  measurements  having  a  similar,  unusually  large  error  on  several 
consecutive  time  steps,  while  by  chance  false  measurements  fell  close 
to  the  predicted  target  positions  on  the  other  branch.  Note  that  after 
loss  of  track  there  is  a  tendency  to  produce  diverging  tracks  with 
small  $\beta$  weights,  and  tracks  with  $\beta$  weights  above  0.5  are  only
produced  on  two  time  steps  out  of  twenty-three. 

5.4  Discussion  and  conclusions

In  section  5.2,  the  performance  of  the  CAF  has  been  compared  with 
that  of  the  PDAF  for  the  standard  example  of  the  baseline  problem  (second 
order  target  model  with  true  and  false  Cartesian  position  measurements) . 

The  results  presented  in  Figs  5.1  to  5.6  should  enable  one  to  obtain  an 
initial  assessment  of  filter  performance  for  a  variety  of  two-dimensional 
tracking  problems.  Even  if  the  required  problem  is  not  of  exactly  the  same 
form  as  the  standard  baseline  case,  it  may  be  possible  to  derive  a  rough 
correspondence  so  that  approximate  values  for  the  equivalent  baseline  problem 
parameters  may  be  found.  An  indication  of  the  average  track  lifetimes 
for  the  CAF  and  the  PDAF  with  the  required  parameter  values  may  be 
obtained  by  interpolation  or  extrapolation  from  the  presented  results. 

This  should  show  whether  the  performance  of  the  PDAF  is  likely  to  be 
adequate  for  the  application,  and  if  not,  whether  the  CAF  can  provide 
the  necessary  improvement.  Clearly  a  detailed  simulation  should  be
carried  out  to  confirm  this  initial  assessment  before  any  implementation
is  attempted.

The  values  of  the  average  track  lifetime  given  in  section  5.2 
depend  on  the  definition  of  track  loss  (see  section  4.4).  A  track  is 
counted  as  lost  if,  over  five  consecutive  time  steps,  either 

(i)  the  true  measurement  is  rejected, 
or  (ii)  the  tracking  error  is  'large'. 

These  criteria  may  not  be  appropriate  for  all  applications.  For 
instance  in  Ref  11  track  loss  is  only  based  on  consistent  rejection 
of  the  true  measurement,  and  it  is  independent  of  tracking  error 
(criterion  (ii)).  Under  this  reduced  definition  of  track  loss,  the 
average  track  lifetime  would  be  much  greater  than  that  shown  in  our 
results.  This  is  especially  so  for  the  higher  values  of  $\rho r$,  as  in
these  cases  track  loss  for  all  replications  was  due  to  criterion  (ii) 
(see  Table  5.1). 

Table 5.1  PROCESSOR TIMINGS AND TRACK LOSS CRITERION

Fig 5.7  Tracking history of Clustering Algorithm filter and PDAF
Fig 5.12  pdf of target position after approximation at the 17th time step
Fig 5.13  pdf of target position after approximation at the 18th time step
Fig 5.14  pdf of target position after approximation at the 19th time step
Fig 5.15  pdf of target position after approximation at the 25th time step

THE SECTOR SCAN PROBLEM


6.1  Introduction

The  tracking  problem  of  section  4.2  has  been  constructed  so  that 
performance  depends  on  only  a  small  number  of  non-dimensional  parameters. 
This  facilitates  the  assessment  of  filter  performance  over  a  wide 
variety  of  tracking  conditions  (section  5.2).  However  this  problem  is
somewhat  unrealistic,  principally  because  practical  sensors,  such  as 
radars,  usually  produce  measurements  in  polar  co-ordinates  rather  than 
Cartesians.  To  show  how  this  complication  can  be  managed,  a  'sector 
scan  problem'  has  been  devised.  This  example  also  serves  to  show  how 
the  assessment  of  section  5.2  can  be  used  to  give  a  rough  indication  of 
filter  performance  for  a  different  tracking  example. 

The  sector  scan  problem  is  to  track  a  target  passing  through  a 
surveillance  sector  in  the  presence  of  false  measurements.  A  sensor 
at  the  origin  produces  position  measurements  in  range  and  bearing,  and 
false  measurements  are  uniformly  distributed  in  polar  co-ordinates. 

On  entering  the  sector,  an  initial  estimate  of  the  target  position  and 
velocity  is  supplied  to  the  filter.  (Note  that  the  question  of 
automatic  track  initiation  is  not  considered  in  this  study  (see 
conference  proceedings  of  Ref  49).)  Since  the  target  could  enter  the 
surveillance  sector  from  any  direction,  it  is  convenient  to  employ
Cartesian  state  variables  which  allow  the  target  kinematics  to  be 
represented  by  a  linear  model,  in  this  case  the  usual  second  order 
model.  This  introduces  a  non-linear  relationship  between  the  state 
vector  and  the  measurements,  which  complicates  the  filtering  problem. 

The  sector  scan  problem  is  also  used  to  investigate  the  effect 
on  performance  of  target  trajectories  which  are  mismatched  to  the  filter 
model.  This  is  another  practical  difficulty  that  must  be  considered  in
filter  design.  Two  different  types  of  mismatched  trajectory  have  been
examined:

(i)  Trajectories  simulated  using  the  second  order  model  with  a 
value  for  the  acceleration  variance  q  which  is  different  from 
that  assumed  by  the  filter. 

(ii)  Deterministic  target  paths  consisting  of  periods  of  constant 
velocity  motion  and  deliberate  manoeuvres. 

6.2  Problem  description  and  solution

The surveillance sector is defined as the region where $X > 0$ km,
$Y > 0$ km and:

$$ 2\ \text{km} < \sqrt{X^2 + Y^2} < 20\ \text{km} . $$

Every second, this region is scanned by a single sensor located at the
origin, and a set of position measurements is passed to the tracking
filter. These measurements are in polar co-ordinates. The probability
of detecting a target that is within the sector is $P_D$, and the range
and bearing errors on the true measurement are independent and Gaussian
with zero mean. The variance of the range error is $\sigma_r^2$ and the
variance of the bearing error is $\sigma_\theta^2$. False measurements
are uniformly distributed in polar co-ordinates. Thus the density of
false measurements per unit area decreases with distance from the origin
(see next section). Only one target is present within the sector.

When a target enters the surveillance region, the tracking filter
is initialized with an estimate of target position and velocity. This
initial estimate has a Gaussian error of known covariance. The success
of the filter in tracking the target is assessed by examining the
position tracking error as the target is leaving the sector. Track is
said to be maintained if:

$$ |\hat{x} - x| < 10\,\sigma_x \quad \text{and} \quad |\hat{y} - y| < 10\,\sigma_y \tag{6.1} $$

where $(x, y)$ are the co-ordinates of the target on the last sensor scan
before the target leaves the sector, $(\hat{x}, \hat{y})$ is the
corresponding filter estimate, and $\sigma_x$ and $\sigma_y$ are the
standard deviations of the equivalent Kalman filter estimate (see later).
This definition of track loss is derived from criterion (ii) of
section 4.4. Clearly the tracking filter is not penalized for poor
performance within the sector, but in practice it has been found that if
the track deviates significantly from the target path, the filter is
unlikely to regain track before the target leaves the sector.

The tracking filters which have been applied to this problem
employ the usual second order target model (equation (4.1)) expressed
in Cartesian co-ordinates, as this avoids the need for a non-linear
model written in polar co-ordinates. However this does introduce a
non-linearity between the true measurement and the target state vector.
Thus equation (2.2) for the baseline problem statement should be
replaced by:

$$ \mathbf{z} = \mathbf{h}(\mathbf{x}) + \mathbf{v} \tag{6.2} $$

where $\mathbf{z}$ is the true measurement, $\mathbf{x}$ is the state
vector and $\mathbf{v}$ is the Gaussian measurement noise at some time
step. For the present example:

$$ \mathbf{z} = \begin{pmatrix} r_M \\ \theta_M \end{pmatrix} , \qquad \mathbf{h}(\mathbf{x}) = \begin{pmatrix} \sqrt{x^2 + y^2} \\ \tan^{-1}(y/x) \end{pmatrix} \tag{6.3} $$

and the covariance of $\mathbf{v}$ is:

$$ R = \begin{pmatrix} \sigma_r^2 & 0 \\ 0 & \sigma_\theta^2 \end{pmatrix} . $$


The  use  of  r  to  denote  range  should  not  cause  any  confusion  with  the 
measurement  noise  variance  of  the  previous  example. 

If it is given that $\mathbf{z}$ is the true measurement and we attempt
to apply the Bayesian techniques of Chapter 2 to this problem, the
posterior pdf of $\mathbf{x}$ after updating with $\mathbf{z}$ will be
non-Gaussian due to the non-linear element $\mathbf{h}(\mathbf{x})$. As
the optimal Bayesian filter for this problem cannot be written in a
simple recursive form, the sub-optimal extended Kalman filter (see
Jazwinski, Ref 27) has been employed. This filter is derived by
linearizing about the state vector prediction at each time step and then
applying the standard Kalman filter relations. Thus at a given time step:

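$$ \mathbf{h}(\mathbf{x}) = \mathbf{h}(\bar{\mathbf{x}}) + [\nabla \mathbf{h}(\bar{\mathbf{x}})] (\mathbf{x} - \bar{\mathbf{x}}) + \text{higher order terms} , \tag{6.4} $$

where $\bar{\mathbf{x}}$ is the state vector prediction,
$[\nabla \mathbf{h}(\bar{\mathbf{x}})]_{ij} = \partial h_i / \partial x_j$
(evaluated at $\bar{\mathbf{x}}$) is the Jacobian matrix, $h_i$ is the
$i$th element of $\mathbf{h}(\mathbf{x})$ and $x_j$ is the $j$th element
of $\mathbf{x}$. For the present example, from equation (6.3):

$$ [\nabla \mathbf{h}(\bar{\mathbf{x}})] = \begin{pmatrix} \cos\bar{\theta} & 0 & \sin\bar{\theta} & 0 \\ -\sin\bar{\theta} / \bar{r} & 0 & \cos\bar{\theta} / \bar{r} & 0 \end{pmatrix} \tag{6.5} $$

where $\bar{r} = \sqrt{\bar{x}^2 + \bar{y}^2}$ and
$\bar{\theta} = \tan^{-1}(\bar{y}/\bar{x})$. The zero columns correspond
to the velocity components of the state vector, on which the measurement
does not depend. To derive the extended Kalman filter, the higher order
terms in the Taylor expansion (6.4) are ignored.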
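
As an illustration of equations (6.3) and (6.5), the following Python
sketch evaluates the measurement function and its Jacobian and checks one
against the other by central differences. The function names are
illustrative only, and the state ordering (x, x-velocity, y, y-velocity)
is an assumption inferred from the zero columns of (6.5). `atan2` is used
in place of tan^-1(y/x); the two agree inside the sector, where x > 0 and
y > 0.

```python
import math

def h(state):
    """Polar measurement (range, bearing) of a Cartesian state.

    Assumed state ordering: (x, vx, y, vy), positions in km.
    """
    x, _vx, y, _vy = state
    return (math.hypot(x, y), math.atan2(y, x))

def jacobian_h(state):
    """2x4 Jacobian of h at `state`, written out as in equation (6.5)."""
    x, _vx, y, _vy = state
    r = math.hypot(x, y)
    cos_t, sin_t = x / r, y / r
    return [[cos_t, 0.0, sin_t, 0.0],
            [-sin_t / r, 0.0, cos_t / r, 0.0]]

def numerical_jacobian(state, eps=1e-6):
    """Central-difference Jacobian, used only to check jacobian_h."""
    rows = [[0.0] * 4 for _ in range(2)]
    for j in range(4):
        up, down = list(state), list(state)
        up[j] += eps
        down[j] -= eps
        h_up, h_down = h(up), h(down)
        for i in range(2):
            rows[i][j] = (h_up[i] - h_down[i]) / (2.0 * eps)
    return rows
```

For a prediction at (8 km, 6 km) the range is 10 km, so the first row of
the Jacobian is (0.8, 0, 0.6, 0) and the second is (-0.06, 0, 0.08, 0).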

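It can be shown that the resulting filter recursions are the same as
those of the standard Kalman filter (equations (2.8)), but that the
innovation vector $(\mathbf{z} - H\bar{\mathbf{x}})$ is replaced by
$(\mathbf{z} - \mathbf{h}(\bar{\mathbf{x}}))$ and elsewhere $H$ is
replaced by $[\nabla \mathbf{h}(\bar{\mathbf{x}})]$. In particular the
covariance of the innovation is given by:

$$ S = [\nabla \mathbf{h}(\bar{\mathbf{x}})] \, M \, [\nabla \mathbf{h}(\bar{\mathbf{x}})]^T + R . \tag{6.6} $$
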
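
The substitution described above can be sketched as a single measurement
update in plain Python. This is a minimal illustration rather than the
implementation used in the study: it assumes the textbook Kalman update
(gain K = M H^T S^-1, covariance M - K S K^T), which is taken here to be
what equations (2.8) express, and the state ordering (x, vx, y, vy). The
name `ekf_update` is hypothetical. No bearing wrap-around is handled,
which is adequate only because the sector lies within the first quadrant.

```python
import math

def ekf_update(x_pred, M, z, R):
    """One extended Kalman filter update for z = h(x) + v.

    x_pred: predicted state [x, vx, y, vy]; M: its 4x4 covariance;
    z: (range, bearing) measurement; R: 2x2 measurement noise covariance.
    Returns (updated state, updated covariance, innovation covariance S).
    """
    px, _, py, _ = x_pred
    r = math.hypot(px, py)
    theta = math.atan2(py, px)
    c, s = px / r, py / r
    H = [[c, 0.0, s, 0.0], [-s / r, 0.0, c / r, 0.0]]    # Jacobian (6.5)

    # M H^T (4x2), then S = H M H^T + R as in (6.6)
    MHt = [[sum(M[i][k] * H[j][k] for k in range(4)) for j in range(2)]
           for i in range(4)]
    S = [[sum(H[i][k] * MHt[k][j] for k in range(4)) + R[i][j]
          for j in range(2)] for i in range(2)]

    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    S_inv = [[S[1][1] / det, -S[0][1] / det],
             [-S[1][0] / det, S[0][0] / det]]
    K = [[sum(MHt[i][k] * S_inv[k][j] for k in range(2)) for j in range(2)]
         for i in range(4)]

    nu = (z[0] - r, z[1] - theta)            # innovation z - h(x_pred)
    x_new = [x_pred[i] + K[i][0] * nu[0] + K[i][1] * nu[1] for i in range(4)]
    # K S K^T equals (M H^T) K^T, so P = M - (M H^T) K^T
    P = [[M[i][j] - sum(MHt[i][k] * K[j][k] for k in range(2))
          for j in range(4)] for i in range(4)]
    return x_new, P, S
```

In a mixture filter this update would be applied once per accepted
measurement for each track, each application giving one component of the
Gaussian mixture.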


The output of the extended Kalman filter may be interpreted as
the mean and covariance of a Gaussian approximation to the true
posterior distribution. Thus when the false measurements are present,
the extended Kalman filter may be used to propagate feasible tracks
to make up a Gaussian mixture distribution for the target state. To
evaluate the mixture weights of this distribution, the prior pdf of the
true measurement for each track is required. This may be approximated
by a Gaussian in polar co-ordinates with mean:

$$ \mathbf{h}(\bar{\mathbf{x}}) = \begin{pmatrix} \bar{r} \\ \bar{\theta} \end{pmatrix} $$

and covariance $S$ given by equation (6.6). Since the false measurements
are uniformly distributed in polar co-ordinates, the mixture weights
are given by equation (2.18) with $H\bar{\mathbf{x}}$ replaced by
$\mathbf{h}(\bar{\mathbf{x}})$ and $S$ given by equation (6.6). Clearly
it is also convenient to carry out an acceptance test in polar
co-ordinates, using this Gaussian as an approximation to the prior pdf
of the true measurement. The filter may be implemented using the PDAF or
the Clustering Algorithm approximation in the usual way.
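
One common way to realise an acceptance test of this kind is an
ellipsoidal gate on the innovation in polar co-ordinates. The sketch below
is an assumption about the form of the test, not a statement of the gate
used in the study: the threshold 9.21 (the 99% point of a chi-square
variable with two degrees of freedom) and the function names are
illustrative.

```python
def mahalanobis_sq(z, z_pred, S):
    """Squared distance nu^T S^-1 nu for a 2x2 innovation covariance S."""
    dr, dth = z[0] - z_pred[0], z[1] - z_pred[1]
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    return (S[1][1] * dr * dr
            - (S[0][1] + S[1][0]) * dr * dth
            + S[0][0] * dth * dth) / det

def accept(measurements, z_pred, S, gate=9.21):
    """Keep the (range, bearing) measurements that fall inside the gate.

    gate: hypothetical threshold on the squared distance (assumption).
    """
    return [z for z in measurements if mahalanobis_sq(z, z_pred, S) <= gate]
```

Because S grows with the filter's perceived tracking error, the gate
widens as the track degrades, which admits more false measurements.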

6.3  Generation  of  target  trajectories  and  measurements

Trajectories of targets passing through the surveillance sector
may be generated either from the second order model, as in the previous
example, or as deterministic trajectories consisting of constant velocity
paths interspersed with deliberate manoeuvres. If the second order model
is used, the variance $q$ of the random numbers driving the model (the
acceleration noise) may be chosen to be different from the model noise
assumed by the filters. This allows the effect of parameter mismatch to
be examined. The initial target heading on entering the sector for each
simulated trajectory is chosen at random from a uniform distribution
over $[0, 2\pi]$, and the initial target speed is selected from a
Gaussian distribution. The initial position is that point on the
boundary of the sector for which the initial velocity vector passes
through the centre of the sector. At each time step a true measurement
may be simulated and false measurements of required density are
generated over the complete sector. The simulation of a trajectory ends
when the target passes out of the sector. Two separate random number
sequences are employed. One of these is used for generating the target
trajectory and the true measurements, while the other is used for
generating false measurements. Thus the density of false measurements
can be changed without altering the trajectories or the true
measurements.
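
The use of two separate random number sequences can be illustrated with a
short Python sketch. Everything specific here is an assumption made for
illustration: the discrete form of the second order model (white
acceleration noise held constant over each step), the entry state, the
noise values, and a fixed number of false measurements per scan in place
of a random count.

```python
import math
import random

def simulate(seed_target, seed_clutter, rho, steps=10, dt=1.0, sqrt_q=0.05):
    """Return (true measurements, false measurements) for a short run."""
    rng_target = random.Random(seed_target)     # trajectory and true measurements
    rng_clutter = random.Random(seed_clutter)   # false measurements only
    x, vx, y, vy = 2.0, 0.2, 2.0, 0.2           # hypothetical entry state (km, km/s)
    n_false = round(rho * 18.0 * math.pi / 2)   # fixed count per scan, for brevity
    truth, clutter = [], []
    for _ in range(steps):
        ax = rng_target.gauss(0.0, sqrt_q)
        ay = rng_target.gauss(0.0, sqrt_q)
        # right-hand sides use the velocities from before the step
        x, vx = x + vx * dt + 0.5 * ax * dt ** 2, vx + ax * dt
        y, vy = y + vy * dt + 0.5 * ay * dt ** 2, vy + ay * dt
        truth.append((math.hypot(x, y) + rng_target.gauss(0.0, 0.03),
                      math.atan2(y, x) + rng_target.gauss(0.0, 0.01745)))
        clutter.append([(rng_clutter.uniform(2.0, 20.0),
                         rng_clutter.uniform(0.0, math.pi / 2))
                        for _ in range(n_false)])
    return truth, clutter
```

Calling `simulate(1, 2, 5.0)` and `simulate(1, 2, 40.0)` gives identical
true measurements but different false measurements, because the clutter
generator never draws from the target stream.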

For this problem we shall not attempt to assess performance over
a wide range of parameters, but the performance about a principal set of
parameters will be investigated. For this principal problem, trajectories
are generated using the second order model with $\Delta t = 1$ second and
the standard deviation $\sqrt{q}$ of the driving acceleration noise
chosen to be:

$$ \sqrt{q} = 0.05\ \text{km sec}^{-2} \approx 5\,\text{'g'} . $$

The initial target speed is drawn from a Gaussian with mean
0.3 km sec$^{-1}$ and standard deviation 0.02 km sec$^{-1}$. Fig 6.1
shows a sample of eight trajectories generated with these parameters.
For a sample of 100 trajectories, on average the target took 48 seconds
to pass through the sector. True measurements produced by the sensor
(a radar for example) have range errors of standard deviation
$\sigma_r = 0.03$ km and angular errors of standard deviation
$\sigma_\theta = 0.01745$ radians $\approx 1^\circ$. The density of false
measurements is $\rho = 10.0$ km$^{-1}$ radians$^{-1}$, so that on
average $18 \times (\pi/2) \times \rho = 282.7$ false measurements per
scan are generated. The sector is scanned every second and the
probability of detecting the target is $P_D = 1$. In Fig 6.2 the
surveillance sector is divided into 54 cells of angular extent
$10\sigma_\theta$ and of radial extent $100\sigma_r$, and the average
(over 100 scans) number of false measurements per scan falling within
each cell is shown. As expected the sample mean fluctuates about
$1000\,\sigma_r \sigma_\theta \rho \approx 5.236$. Initial estimates of
target position and velocity, which are available to the filters, are in
Cartesian co-ordinates. The standard deviation of the position error is
0.1 km on each co-ordinate and the standard deviation of the velocity
error is 0.03 km sec$^{-1}$ for each co-ordinate. These principal problem
parameters are listed in Table 6.1.
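
The false measurement figures quoted above can be checked with a short
Python sketch. The Poisson model for the number of false measurements per
scan is an assumption (the text states only the average), and the function
names are illustrative.

```python
import math
import random

R_MIN, R_MAX = 2.0, 20.0            # sector range limits, km
SIGMA_R, SIGMA_THETA = 0.03, 0.01745

def expected_false_count(rho):
    """Mean false measurements per scan: rho x range extent x angle extent."""
    return rho * (R_MAX - R_MIN) * (math.pi / 2)

def expected_cell_count(rho):
    """Mean count in a cell of extent 100*sigma_r by 10*sigma_theta."""
    return rho * (100 * SIGMA_R) * (10 * SIGMA_THETA)

def area_density(rho, r):
    """False measurements per km^2 at range r (area element is r dr dtheta)."""
    return rho / r

def poisson(rng, mean):
    """Poisson sample: arrivals of a unit-rate process within [0, mean]."""
    count, total = 0, rng.expovariate(1.0)
    while total < mean:
        count += 1
        total += rng.expovariate(1.0)
    return count

def false_measurements(rng, rho):
    """One scan of false measurements, uniform in (range, bearing)."""
    n = poisson(rng, expected_false_count(rho))   # Poisson count: assumption
    return [(rng.uniform(R_MIN, R_MAX), rng.uniform(0.0, math.pi / 2))
            for _ in range(n)]
```

`area_density` makes explicit why the density per unit area falls with
distance from the origin: a uniform density in (r, theta) corresponds to
rho / r per unit area.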

No direct correspondence between the parameters of this problem
and the assessment example of section 5.2 (with Cartesian measurements) is
possible. However the number of false measurements falling within a cell
defined by the standard deviation of the true measurement error is
$\rho \sigma_r \sigma_\theta \approx 0.0052$, which corresponds to the
parameter $\rho r$ of the assessment example. Also the non-dimensional
parameter:

$$ \frac{q \Delta t^4}{\sigma_r \sigma_\theta \times \text{range}} $$

is analogous to $q \Delta t^4 / r$ of the assessment example. Hence taking
the standard range to be 11 km, ie to the centre of the sector, the
equivalent parameters of the assessment example are approximately
$\rho r = 0.005$ together with the value of $q \Delta t^4 / r$ obtained
from the expression above.

The closest data point for which an estimate of track lifetime $N_{AVE}$
is available for the assessment example is:

$$ \rho r = 0.005 \quad \text{and} \quad q \Delta t^4 / r = 1 $$

(see Fig 5.3). For these parameters:

$$ N_{AVE} = 78.03 \ \text{for the PDAF} \quad \text{and} \quad N_{AVE} = 835.65 \ \text{for the CAF} . $$

Assuming an exponential distribution for track lifetime, the probability
of a track surviving for at least $t$ time steps is:

$$ \exp(-t / N_{AVE}) . $$

As noted, the average time for a target to pass through the sector is
$t = 48$ seconds, therefore we can expect the PDAF to maintain track
on about 54% of targets and the CAF to maintain track on about 94.4%
of targets.
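
The arithmetic behind this prediction can be reproduced with a few lines
of Python. The grouping of terms in the second correspondence parameter
follows our reading of the expression above and should be treated as an
assumption; the variable names are illustrative.

```python
import math

rho, sigma_r, sigma_theta = 10.0, 0.03, 0.01745   # principal problem values
q, dt, mid_range = 0.05 ** 2, 1.0, 11.0           # q = (sqrt q)^2, range in km

# Equivalent baseline parameters (assumed correspondence, see lead-in)
rho_r_equiv = rho * sigma_r * sigma_theta
q_equiv = q * dt ** 4 / (sigma_r * sigma_theta * mid_range)

def survival_probability(t, n_ave):
    """P(track survives at least t steps) for exponential lifetimes."""
    return math.exp(-t / n_ave)

p_pdaf = survival_probability(48, 78.03)    # about 0.54
p_caf = survival_probability(48, 835.65)    # about 0.944
```

With these inputs `rho_r_equiv` is about 0.0052 and `q_equiv` is about
0.43, which is why the data point at 0.005 and 1 is taken as the nearest
available.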

6.4  Simulation  results

6.4.1  Correctly  matched  parameters 

The  CAF  and  the  PDAF  were  applied  to  100  replications  of  this 
problem  for  the  standard  parameters  given  above.  For  the  PDAF  72%  of 
the  tracks  were  maintained  while  for  the  CAF  95%  of  the  tracks  were 
maintained.  Thus  the  performance  prediction  of  the  previous  paragraph 
was  very  accurate  for  the  CAF  but  rather  pessimistic  for  the  PDAF.  This 
discrepancy  is  probably  due  to  the  imprecise  correspondence  between  the 
two  problems  and  the  neglect  of  any  initial  transient  behaviour  of  the 
filters.  Fig  6.3  shows  an  example  of  the  CAF  and  the  PDAF  tracking  a 
target  across  the  sector.  In  this  example  the  CAF  successfully 
maintained  track  although  the  PDAF  track  became  lost.  Fig  6.4  shows 
an  example  of  the  extended  Kalman  filter  tracking  in  the  absence  of 
false  measurements.  This  figure  shows  the  true  measurements  produced 
by  the  sensor;  the  increase  in  the  measurement  and  tracking  errors  as 
the  range  from  the  sensor  to  the  target  increases  can  be  clearly  seen. 

Fig 6.5 shows how the tracking performance of the CAF and the PDAF
is affected by varying the density $\rho$ of false measurements without
changing the target trajectories or the true measurements. Tracking
performance is shown for each of the 100 replications for $\rho$ = 5, 10,
20, 30 and 40 km$^{-1}$ radians$^{-1}$. For each of these values of
$\rho$ two traces are shown, one corresponding to the PDAF and the other
corresponding to the CAF. Each trace has two levels H and L,
according to whether a track was held or lost for each replication.
It can be seen that for each value of $\rho$, every track held by the
PDAF was also held by the CAF. One might expect that those tracks held by
one of the filters for large $\rho$ would also be maintained by that
filter for smaller values of $\rho$. However this is not always so,
because the random false measurements for a particular replication
change completely as $\rho$ is varied. Similarly, some tracks lost for
small $\rho$ are held for large $\rho$. As $\rho$ is increased, the
number of tracks maintained by each filter decreases.

This can be seen more clearly in Fig 6.6 where the percentage
of tracks maintained by each filter is plotted against $\rho$.
Confidence limits at the 95% level, derived from a binomial distribution
since each replication is an independent Bernoulli trial, are given with
each percentage. Also the average number of components before and after
reduction is shown for the held tracks. The results exhibit trends
similar to those described in section 5.2 for the previous example. For
small $\rho$, the PDAF and CAF both hold nearly all of the tracks, but
for $\rho > 5$ km$^{-1}$ rad$^{-1}$, the CAF becomes more successful at
maintaining track than the PDAF. The average number of mixture components
generated increases with $\rho$, as does the required processing time
recorded in Table 6.2 (part I). In this table the average computation
time per step is given for held tracks and lost tracks separately.
For small $\rho$, the computation time for held and lost tracks is
similar, although for large $\rho$ the average timings for lost tracks
are much greater, particularly for the CAF. This is due to the
proliferation of feasible tracks which occurs for large $\rho$ when the
target is lost.
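
Binomial confidence limits of the kind quoted with each percentage can be
computed as in the following Python sketch. The text does not say which
construction was used, so the exact (Clopper-Pearson) interval shown here
is an assumption, and the function names are illustrative.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p ** i * (1.0 - p) ** (n - i)
               for i in range(k + 1))

def _bisect(increasing_fn, target):
    """Solve increasing_fn(p) = target for p in [0, 1] by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if increasing_fn(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def clopper_pearson(k, n, level=0.95):
    """Exact two-sided limits for a proportion, given k successes in n trials."""
    alpha = 1.0 - level
    # lower limit: P(X >= k) = alpha/2 ; upper limit: P(X <= k) = alpha/2
    lower = 0.0 if k == 0 else _bisect(
        lambda p: 1.0 - binom_cdf(k - 1, n, p), alpha / 2)
    upper = 1.0 if k == n else _bisect(
        lambda p: 1.0 - binom_cdf(k, n, p), 1.0 - alpha / 2)
    return lower, upper
```

For example, `clopper_pearson(72, 100)` gives the limits for a filter
that held 72 tracks out of 100 replications.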

Also shown in Table 6.2 (part I) is an indication of the accuracy
of the filters' own assessment of their tracking error in both position
and velocity. This consistency measure, denoted $E$, is derived as
follows. At each time step, the quadratic form:

$$ (\mathbf{x} - \hat{\mathbf{x}})^T P^{-1} (\mathbf{x} - \hat{\mathbf{x}}) \tag{6.7} $$

is evaluated, where $\mathbf{x}$ is the true value of the state vector,
$\hat{\mathbf{x}}$ is the filter's estimate of the state vector (ie the
mean of the mixture distribution) and $P$ is the overall covariance
matrix of the mixture. If the filter's internal covariance $P$ is
compatible with the actual tracking error
$\mathbf{x} - \hat{\mathbf{x}}$, then the expected value of the
quadratic form (6.7) is 4, because for this tracking problem the state
vector is four-dimensional. The statistic $E$ given in Table 6.2 is
calculated by averaging (6.7) over all time steps for held and lost
tracks separately. Since the tracking error
$\mathbf{x} - \hat{\mathbf{x}}$ may be correlated over several time steps
and it may not be a Gaussian variable, we cannot expect the sum of (6.7)
over all time steps to have a $\chi^2$ distribution. However since $E$ is
usually the result of an average over many hundreds of time steps, if $E$
deviates from 4 by as much as one unit, it is reasonable to conclude that
the filter's internal covariance $P$ is incompatible with the actual
tracking error. Table 6.2 shows that for both filters the value of $E$
for maintained tracks is usually slightly less than 4, but within 10% of
this figure. This indicates that the achieved tracking error is a little
better than the filters' assessment, and this is possibly because the
maintained tracks are a biased sample in favour of the more accurate
tracks. For lost tracks, $E$ is usually very much larger than 4, showing
that the filters seriously underestimate the tracking error. The CAF is
worse than the PDAF in this respect.
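
A minimal Python sketch of the consistency statistic follows. It
evaluates the quadratic form (6.7) by solving a linear system rather than
inverting P explicitly, and averages over a list of per-step records. The
record layout and function names are assumptions made for illustration.

```python
def solve(A, b):
    """Solve A y = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * c for a, c in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def quadratic_form(x_true, x_est, P):
    """(x - x_est)^T P^-1 (x - x_est), as in (6.7)."""
    err = [a - b for a, b in zip(x_true, x_est)]
    y = solve(P, err)                      # y = P^-1 err
    return sum(e * v for e, v in zip(err, y))

def consistency_statistic(records):
    """Average of (6.7) over records of (x_true, x_est, P), one per time step."""
    values = [quadratic_form(x, x_hat, P) for x, x_hat, P in records]
    return sum(values) / len(values)
```

With a four-dimensional state and a covariance that matches the errors,
the average should settle near 4; values well above 4 indicate that the
filter is underestimating its error.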

The actual mean square position tracking errors achieved by the
filters for the first 20 time steps are shown in Fig 6.7 for $\rho$ = 5,
10, 20 and 40 km$^{-1}$ rad$^{-1}$. These results are obtained by
averaging the square of the position error at a particular time step over
all replications. The mean square error is also shown for the maintained
tracks only. As a reference level the tracking error for the
extended Kalman filter, which is supplied only with true measurements,
has been plotted. The results shown in Fig 6.7 are intuitively
reasonable. The Kalman filter tracking errors are smaller than those
for the filters which have to cope with false measurements, and the
errors averaged over all tracks are greater than those for held tracks
only. Errors for the CAF tend to be smaller than those of the PDAF,
although for $\rho$ = 20 km$^{-1}$ rad$^{-1}$ the 13 tracks held by the
PDAF have a smaller mean square position error than the 86 tracks held by
the CAF. Thus it appears that PDAF tracks are only able to survive in
this case if they are able to achieve a relatively small position error
in the early stages of the track (average track length being 48 time
steps).

6.4.2  Mismatched  model  noise 

One would expect performance to degrade if the assumed values of
the filter parameters $\rho$, $q$, $\sigma_r$, $\sigma_\theta$ and $P_D$
differ from their correct values. Here we examine the effect of a
mismatch in the parameter $q$, the variance of the model noise, which
describes the manoeuvrability of the target. If the value of $q$ assumed
by the filters is less than the correct value, the filters may judge
actual target manoeuvres to be highly improbable, in which case true
measurements may be rejected or given a very low probability weighting.
If the value of $q$ is too high, the filters may give too much weighting
to false measurements which could only be true if the target had
performed a large manoeuvre incompatible with the correct value of $q$.
An adaptive version of the PDAF which learns an unknown value of $q$ from
a set of possible candidates has been proposed by Gauvrit. However for
the present study only the fixed parameter filter has been considered.

The 100 trajectories simulated for the standard problem parameters
(see Table 6.1) were used to investigate the effect of supplying the
filters with the incorrect value of $q$. The CAF and the PDAF have been
applied with $\sqrt{q}$ set to 0.01, 0.025, 0.1 and 0.25 km sec$^{-2}$,
as well as the correct value of 0.05 km sec$^{-2}$. The percentages of
tracks maintained for these values are given in Fig 6.8 together with the
average number of components generated for the maintained tracks. (The
error reference for the track loss criterion is obtained from the Kalman
filter using the correct value of $\sqrt{q}$.) It appears that the CAF
performance is less sensitive to parameter mismatch than the PDAF. This
extra flexibility of the CAF is due to the filter's ability to retain
several feasible tracks. Indeed there is a slight (probably
insignificant) performance improvement for the CAF when $\sqrt{q}$ is
doubled, although the percentage of tracks held by the PDAF is reduced
from 72% to 16%. When $\sqrt{q}$ is increased to five times its correct
value, the number of tracks held by the CAF is reduced by about one
third, although the PDAF now loses all of the tracks. As $\sqrt{q}$ is
decreased from its correct value, the performance of both filters
degrades at a similar rate. Also note that the number of components
generated by the filter increases with $\sqrt{q}$. This is because with
increasing $\sqrt{q}$, the filters believe the target to be capable of
larger manoeuvres and so are more ready to accept false measurements.

Average computation time per step and the error statistic $E$
are given in part II of Table 6.2. CAF processing time increases with
$\sqrt{q}$ due to the increasing number of components generated. The
reasonable CAF track maintenance performance obtained when $\sqrt{q}$ is
five times its correct value is at the expense of a 60-fold increase in
computation time for held tracks. The PDAF incurs only a small increase
in processing time, although performance falls off rapidly when $q$ is
too large.

The use of an incorrect value for $q$ has a noticeable effect on
the error statistic $E$. When $q$ is too large, $E$ is significantly
less than four for both held and lost tracks, showing that the filters
are overestimating their tracking errors, so that the filter gains are
set too high. When $q$ is too small, $E$ is much greater than four,
showing that the filters are overoptimistic about their tracking
performance. In this case the filter gains are too small so that the
filters are insufficiently responsive to received measurements. The
actual mean square position errors for maintained CAF tracks with
mismatched $q$ are shown in Fig 6.9 for the first twenty time steps.
After the first few time steps, there is a clear trend for tracking
error to increase as the assumed value of $q$ deviates further from its
correct value. When $\sqrt{q}$ is five times or one fifth of its correct
value, the mean square position error after the tenth time step is
approximately ten times that obtained with the correct value of $q$.

6.4.3  Trajectories  with  deliberate  manoeuvres 

In this section we investigate the tracking performance of the
filters when the target executes deterministic manoeuvres which do not
obey the filter model. This is a further degree of mismatch between
the assumed and the actual target behaviour. Two types of trajectory
have been simulated, both of which start with a constant velocity
course. The initial position and velocity of the target on entering
the sector are chosen as described in section 6.3. For the first type
of trajectory, the target proceeds on the constant velocity course for
12 seconds after entering the sector, then performs a sinusoidal weave
with half amplitude 1 km and frequency 0.05 Hz, and finally returns to
a constant velocity course after 35 seconds of weaving. For this weave
the maximum target acceleration is about 10 'g' at the extremities of
the sinusoid. An example of this type of trajectory is shown in
Fig 6.10. For the second type of trajectory, after having travelled
in a straight line for 25 seconds, the target turns in a circular arc
for 15 seconds and then resumes a constant velocity course. The
radius of the arc is 1 km, so that for the mean target speed, the
acceleration whilst turning is about 9 'g'. An example of this type of
trajectory is shown in Fig 6.11.
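
The quoted accelerations can be checked with a few lines of arithmetic. The sketch below assumes a pure sinusoidal lateral displacement for the weave (peak acceleration equal to the half amplitude times the squared angular frequency), uniform circular motion for the turn, the mean target speed of 0.3 km sec⁻¹ from Table 6.1, and 1 'g' = 9.81 m sec⁻². The function names are illustrative.

```python
import math

G_KM_S2 = 9.81e-3  # assumed value of 1 'g' expressed in km/s^2


def weave_peak_accel(half_amplitude_km, frequency_hz):
    """Peak acceleration (km/s^2) of a sinusoidal weave y = A sin(2 pi f t)."""
    omega = 2.0 * math.pi * frequency_hz
    return half_amplitude_km * omega ** 2


def turn_accel(speed_km_s, radius_km):
    """Centripetal acceleration (km/s^2) for a constant-speed circular arc."""
    return speed_km_s ** 2 / radius_km


weave_g = weave_peak_accel(1.0, 0.05) / G_KM_S2
turn_g = turn_accel(0.3, 1.0) / G_KM_S2
print(f"weave: {weave_g:.1f} g, turn: {turn_g:.1f} g")
```

Both values agree with the figures of about 10 'g' and 9 'g' given in the text, and both correspond to accelerations close to 0.1 km sec⁻².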

For these trajectories, the motion of the target switches between
periods of constant velocity motion and periods of high 'g' manoeuvres.
In these circumstances, one would ideally employ different target models
for the two phases of the trajectory. For example, if the second order
model were used, q = 0 would be correct for constant velocity motion while
a value of √q close to the maximum acceleration that can be achieved by
the target might be appropriate (but not ideal) for periods of
manoeuvre. Usually the filter does not know when the target is going
to execute a manoeuvre and so adaptive tracking schemes have been
suggested. For instance the Interacting Multiple Model (IMM) algorithm
of Blom assumes that the target motion may be described by one of
a set of possible models, and that the motion changes abruptly between
these models with some assumed switching probability. This introduces
a further degree of uncertainty into the tracking problem which gives
rise to a large increase in the number of components making up the
mixture distribution of the target state. Houles and Bar-Shalom¹⁴
have applied the IMM algorithm with the PDAF to an example which is
very similar to the sector scan problem. However for the present study
a single target model with fixed parameters has been employed to avoid
the added complication of a multiple model filter.

One hundred replications of each of the two types of trajectory
have been generated together with measurements with the standard
parameters (see Table 6.1). Figs 6.12 and 6.13 show the percentage
of these tracks maintained by the CAF and the PDAF for different values
of the assumed model noise standard deviation √q. The correct
measurement parameters $\sigma_r$, $\sigma_\theta$ and ρ were supplied to the filters
and as usual the reference error for the track loss criterion was
obtained from the Kalman filter with √q = 0.05 km sec⁻². For both
types of trajectory the performance of the PDAF is poor, with the
percentage of held tracks rising above 10% only for √q = 0.05 km sec⁻².
For the CAF the best performance is achieved at the higher value of
√q = 0.1 km sec⁻², for which 97% of weaving tracks and 99% of circling
tracks were held. Note that this value of √q is close to the
maximum acceleration of the targets when they are performing their
manoeuvres. As in the case of second order model trajectories with
mismatched q, the performance of the CAF appears to be less sensitive
than the PDAF to variation of q; reasonable CAF performance is
obtained with √q = 0.25 and 0.05 km sec⁻². It would be interesting
to see if use of the IMM algorithm would improve the PDAF performance.

As already indicated, a single value of q is a compromise for
this problem. This is highlighted in Figs 6.14 and 6.15 which show
the mean square position error as a function of the time step for √q
set to 0.025, 0.05, 0.1 and 0.25 km sec⁻². For maintained tracks, the
minimum error for the initial constant velocity path is obtained for the
smallest value of q, although the minimum tracking error during the
manoeuvre is obtained for √q = 0.1 km sec⁻². Generally for fixed q
the tracking error is greatest during the target manoeuvre, and this is
when tracks are usually lost, as can be seen from the traces showing
error averaged over all tracks. However for the high value of
√q = 0.25 km sec⁻², the error for maintained tracks is fairly
constant over the whole trajectory after the initial transient. For
the weaving trajectories with √q = 0.1 km sec⁻² (for which the CAF
performs best), it can be seen from Fig 6.14 that the largest CAF
tracking errors occur just after the turning points of the weave, when
the target is pulling maximum 'g'. This is also clear in the tracking
example shown in Fig 6.10.

The average processing time per step and the error statistic E
for these simulations are recorded in Table 6.3. These values are
averaged over all time steps of the trajectories, including periods of
manoeuvre and constant velocity motion. As for the case of mismatched
q with second order model trajectories, the processing time for the
CAF rises with q, at first gently and then steeply for √q > 0.1 km sec⁻².
This is reflected in the number of mixture components generated
(see Figs 6.12 and 6.13). PDAF computation time also rises with q,
but does not show the sharp rise of the CAF for √q > 0.1 km sec⁻².
For √q ≤ 0.1 km sec⁻², average CAF processing time is three to four
times greater than that of the PDAF.

Since the generated trajectories do not match the filter's target
model and the level of manoeuvre changes in mid-course, we cannot
expect E to be very close to four. However for √q = 0.1 km sec⁻²,
when the CAF performs best, the values of E for weaving and circular
manoeuvring targets are within an order of magnitude of four, which
suggests that this value of q is a reasonable compromise for these
trajectories.

6.5 Conclusions

The  sector  scan  problem  presented  in  this  chapter  provides  a  more 
realistic  demonstration  of  the  baseline  problem.  The  extended  Kalman 
filter  has  been  employed  to  manage  the  non-linear  relationship  between 
the  measurements  in  polar  co-ordinates  and  the  target  model  in 
Cartesian  co-ordinates.  Essentially  the  measurement  association  and 
evaluation  of  the  probability  weights  of  the  mixture  pdf  are  performed 
in  polar  co-ordinates,  while  the  calculation  of  the  mean  and 
covariance  of  each  mixture  component  (the  filtering  operation)  is 
performed  in  Cartesian  co-ordinates. 
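
The non-linear relationship handled by the extended Kalman filter can be sketched as a measurement function and its Jacobian. The Python fragment below is a minimal illustration under stated assumptions: the state ordering (x, vx, y, vy), the bearing convention (angle measured from the X axis) and the function names are choices made for the example and are not taken from this report.

```python
import math


def polar_measurement(state):
    """Map a Cartesian state (x, vx, y, vy) to a (range, bearing) pair."""
    x, _, y, _ = state
    return math.hypot(x, y), math.atan2(y, x)


def polar_jacobian(state):
    """2x4 Jacobian of polar_measurement, used to linearise about an estimate."""
    x, _, y, _ = state
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [
        [x / r, 0.0, y / r, 0.0],      # partial derivatives of range
        [-y / r2, 0.0, x / r2, 0.0],   # partial derivatives of bearing
    ]


state = (3.0, 0.1, 4.0, -0.2)          # km and km/s, illustrative values
rng_km, bearing = polar_measurement(state)
jac = polar_jacobian(state)
```

In such a scheme the predicted measurement and innovation covariance are formed in polar co-ordinates from the Jacobian, while the component means and covariances remain in Cartesian co-ordinates, which is the division of labour described above.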

The  effect  on  filter  performance  of  a  mismatch  between  the  statistics 
of  the  actual  target  trajectory  and  the  assumed  filter  model  has  been 
studied.  For  trajectories  generated  by  the  second  order  model,  CAF 
performance  is  less  sensitive  to  mismatch  than  the  PDAF.  Also  for  the 
deterministic  manoeuvres,  the  CAF  achieves  acceptable  performance  over 
a  wider  range  of  filter  model  parameters  than  the  PDAF.  This  extra 
flexibility  of  the  CAF  is  due  to  the  filter's  ability  to  retain  several 
feasible tracks. As might be expected, statistical analysis shows that
the filters' internal assessment of tracking error is unreliable if the
filter model is incorrect. Filter assessment is optimistic when the
manoeuvre parameter q is too small and it is pessimistic when q is
too large.


Table 6.1

PRINCIPAL PROBLEM PARAMETERS FOR THE SECTOR SCAN PROBLEM

Surveillance sector is the region:

    X > 0 km, Y > 0 km

and

    2 km < √(X² + Y²) < 20 km

Second order target model:

Standard deviation of acceleration noise for each co-ordinate is:

    √q = 0.05 km sec⁻² ≈ 5 'g'

Initial target speed (on entering the sector) is drawn from a Gaussian
distribution with mean 0.3 km sec⁻¹ and standard deviation
0.02 km sec⁻¹.

Initial estimate of target state supplied to filters is a Gaussian
perturbation about the true state. For each Cartesian co-ordinate,
standard deviation of velocity error is 0.03 km sec⁻¹
and standard deviation of position error is 0.1 km.

True measurements have a Gaussian range error with standard deviation
$\sigma_r$ = 0.03 km and a Gaussian bearing error with $\sigma_\theta$ = 0.01745
radians ≈ 1°.

Probability of detection $P_D$ = 1.

False measurements are uniformly distributed over the surveillance
sector in polar co-ordinates with density ρ = 10.0 km⁻¹ radian⁻¹.
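
As a rough check on the severity of this environment, the false measurement density can be converted into an expected count per scan. The sketch below assumes that the sector corresponds to ranges from 2 km to 20 km and bearings from 0 to π/2 radians (the quadrant X > 0, Y > 0), and that false measurements are drawn uniformly in range and bearing. The constant and function names are illustrative, and the number of points drawn is left as an input because the count distribution is not restated here.

```python
import math
import random

RANGE_MIN_KM, RANGE_MAX_KM = 2.0, 20.0
BEARING_MIN, BEARING_MAX = 0.0, math.pi / 2.0
DENSITY = 10.0  # false measurements per km per radian


def expected_false_count():
    """Density multiplied by the extent of the sector in polar co-ordinates."""
    return DENSITY * (RANGE_MAX_KM - RANGE_MIN_KM) * (BEARING_MAX - BEARING_MIN)


def sample_false_measurements(count, rng):
    """Draw `count` (range, bearing) pairs uniformly in polar co-ordinates."""
    return [
        (rng.uniform(RANGE_MIN_KM, RANGE_MAX_KM),
         rng.uniform(BEARING_MIN, BEARING_MAX))
        for _ in range(count)
    ]


def to_cartesian(range_km, bearing_rad):
    """Convert a polar measurement to Cartesian (X, Y) in km."""
    return range_km * math.cos(bearing_rad), range_km * math.sin(bearing_rad)


samples = sample_false_measurements(1000, random.Random(0))
print(round(expected_false_count(), 1))
```

Under these assumptions the expected count is 10 × 18 × π/2, or roughly 283 false measurements over the whole sector, only a small fraction of which fall near the target at any one time step.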


Table 6.2

Table 6.3

PROCESSOR TIMINGS AND ERROR STATISTIC E: SECTOR SCAN
PROBLEM WITH DETERMINISTIC TRAJECTORIES

                 Assumed
                 filter      Tracks     Average cpu time for
                 model       held (H)   single step (ms)         Error statistic E
                 noise √q    or lost
Manoeuvre        (km sec⁻²)  (L)        CAF         PDAF         CAF          PDAF

Weave            0.010       H          1.31*                    989.8*       505.6*
manoeuvre                    L          1.22                     5922.0       7371.0
                 0.025       H          1.69                     328.0        184.00
                             L          1.36        0.469        6110.0       5821.70
                 0.050       H          1.77        0.478        17.95        65.70
                             L          1.78        0.645        4075.00      1825.00
                 0.100       H          2.57        0.512*       2.366        2.752*
                             L          1.87*       1.140        2372.0*      3.312
                 0.250       H          113.00      -            0.9630
                             L          221.00      1.580        1.0800       2.430
                 0.300       H          507.00      -            0.6860       -
                             L          744.00      1.620        1.4750       2.386

Circular         0.025       H          1.48*       0.463*       130.6*       41.9*
arc                          L          1.38        0.478        19720.0      15230.0
manoeuvre        0.050       H          1.70                     56.87        626.7
                             L          3.73                     7369.00      4899.0
                 0.100       H          2.43        0.495*       25.05        15.06*
                             L          2.11*       1.150        528.2*       12.39
                 0.250       H          101.50                   4.922        -
                             L          305.40      1.610        2.932        3.807
                 0.300       H          307.00      -            2.701        -
                             L          834.00      1.640        2.554        3.713

* Indicates a small sample (fewer than five replications)


Fig 6.1 A sample of eight target trajectories from the sector scan problem
(note that targets pass through the sector one at a time). Figure
annotations: initial speed standard deviation 0.02 km s⁻¹; acceleration
noise standard deviation √q = 0.05 km s⁻² ≈ 5 'g'.

Fig 6.4 An example of tracking with the extended Kalman filter
(parameters of Table 6.1, but ρ = 0). Legend: actual target position (o),
true measurements (+), Kalman filter (line).

Fig 6.5 Filter performance at each replication for increasing density of
false measurements (axis: replication number)

Performance of CAF and PDAF for sector scan problem as a function of
(other parameters from Table 6.1). Panels: % of tracks maintained; average
number of components for held tracks.

Fig 6.7 Achieved mean square position error

Fig 6.9 Mean square position error for maintained CAF tracks with mismatched q.
Legend: assumed filter model noise standard deviation √q = 0.25, 0.1, 0.05,
0.025 and 0.01 km s⁻²; the actual value for the trajectory disturbance is
marked.


7 THE DATA FUSION PROBLEM

7.1 Introduction

In this chapter the problem of fusing information from a number
of sources is considered. For the baseline problem, the only available
data are the position measurements received from a single sensor at
each time step. In many practical cases an imperfect classification of
these position measurements (into true or false categories) may be
available. A simple extension of the baseline filter enables this
classification information to be incorporated into the posterior pdf
of target state (section 7.2). It is also possible that several
independent sensors may be available to supply position measurements at
each time step. Data from each sensor may be incorporated sequentially
(section 7.3), although this may be time consuming. In section 7.4 we
derive a computationally efficient sub-optimal filter for combining
information from a primary sensor with measurements from an auxiliary
sensor. In the example considered, the auxiliary sensor gives only
bearing information but does include an imperfect classification of
these measurements. The sub-optimal filter uses the auxiliary
measurements to modify only the probability weights of the mixture
distribution after updating from the primary sensor.

7.2 Incorporation of classification data

7.2.1 Problem formulation and solution

The problem here is the same as the baseline case except that with
every measurement an imperfect classification feature d is available.
Thus at some time step k, it is assumed that a set of data (Z, D) is
received, where:

\[ Z = \{ \mathbf{z}_j : j = 1, \ldots, m \} \]

\[ D = \{ d_j : j = 1, \ldots, m \} . \]

(For convenience we shall omit the subscript k throughout this
chapter.) Each classification feature $d_j$ is independent of the
values of $\mathbf{x}$ and Z, although it is known to correspond to
measurement j. The value of $d_j$ depends only on whether measurement j is
true or false. It is assumed that the pdf of $d_j$ conditional on
measurement j being true is known, and it is denoted $p(d_j \mid T)$.
Similarly the pdf of $d_j$ conditional on measurement j being false is
known, and it is denoted $p(d_j \mid F)$. With this knowledge, it is clear
that the data set D may provide useful information as to which, if
any, of the m measurements is the true one. We shall now derive the
Bayesian filter which makes use of the classification features.


Following the reasoning of section 2.3.2, the posterior pdf of $\mathbf{x}$
after incorporation of the latest sensor data (Z, D) may be written:

\[ p(\mathbf{x} \mid Z, D) = \sum_{i=1}^{n} \sum_{j=0}^{m} p(\mathbf{x} \mid \Psi'_{ij}, Z, D) \, \Pr\{\Psi'_{ij} \mid Z, D\} . \tag{7.1} \]

As in section 2.3.2, the explicit dependency on past data has been
omitted. First consider the pdf of $\mathbf{x}$ conditional on $\Psi'_{ij}$. Since
the truth or falsehood of each measurement is specified by $\Psi'_{ij}$, the
classification data does not contribute any extra information (it is
independent of $\mathbf{x}$), so:

\[ p(\mathbf{x} \mid \Psi'_{ij}, Z, D) = p(\mathbf{x} \mid \Psi'_{ij}, Z) , \]

which is given as usual by equation (2.8). Thus the classification
data only affects the weighting probabilities of the mixture
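distribution. By direct analogy with equation (2.10):

\[ \Pr\{\Psi'_{ij} \mid Z, D\} = \frac{p(Z, D \mid \Psi_i, \psi_j) \, \Pr\{\Psi_i, \psi_j\}}{p(Z, D)} \tag{7.2} \]

where $\Psi_i$ is a prior hypothesis and $\psi_j$ is the hypothesis that
measurement j is true (so that $\Psi'_{ij}$ is their conjunction). Now since
D is independent of Z, and D depends only on $\psi_j$,

\[ p(Z, D \mid \Psi_i, \psi_j) = p(Z \mid \Psi_i, \psi_j) \, p(D \mid \psi_j) . \]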

Hence, comparing with equation (2.10) it can be seen that:

\[ \Pr\{\Psi'_{ij} \mid Z, D\} \propto p(D \mid \psi_j) \, \Pr\{\Psi'_{ij} \mid Z\} \tag{7.3} \]

where $\Pr\{\Psi'_{ij} \mid Z\}$ is the usual probability weighting for the baseline
problem given by equation (2.18), and:

\[ p(D \mid \psi_j) =
\begin{cases}
p(d_j \mid T) \displaystyle\prod_{l=1,\, l \ne j}^{m} p(d_l \mid F) & \text{for } j \ne 0 \\[2ex]
\displaystyle\prod_{l=1}^{m} p(d_l \mid F) & \text{for } j = 0 ,
\end{cases} \tag{7.4} \]

since the elements of D are independent. Thus dividing through by
$p(D \mid \psi_0)$, from (7.3) we obtain:

\[ \Pr\{\Psi'_{ij} \mid Z, D\} =
\begin{cases}
L(d_j) \, \Pr\{\Psi'_{ij} \mid Z\} / E & \text{if } j \ne 0 \\[1ex]
\Pr\{\Psi'_{i0} \mid Z\} / E & \text{if } j = 0
\end{cases} \tag{7.5} \]

where $L(d) = p(d \mid T) / p(d \mid F)$ is a likelihood ratio and:

\[ E = \sum_{i=1}^{n} \left[ \Pr\{\Psi'_{i0} \mid Z\} + \sum_{l=1}^{m} L(d_l) \, \Pr\{\Psi'_{il} \mid Z\} \right] . \]

From equation (7.5) it is clear how the classification data may modify
the original probability weightings of the baseline problem through the
likelihood ratio L(d). As usual, an estimate of $\mathbf{x}$ may be obtained
from equation (7.1), and prediction forwards to obtain the prior pdf
at the following time step follows from the state propagation equation
as indicated in section 2.3.3.
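
The reweighting of equation (7.5) amounts to scaling each baseline weight by the likelihood ratio of the measurement it declares true and then renormalising. The Python sketch below illustrates this under stated assumptions: the baseline weights are supplied as a nested list indexed by prior hypothesis i and measurement index j (with j = 0 meaning that no measurement is true), and the function name and data layout are illustrative rather than taken from this report.

```python
def reweight_with_classification(base_weights, likelihood_ratios):
    """Apply the likelihood-ratio reweighting of equation (7.5).

    base_weights[i][j]   : baseline weight for prior hypothesis i and
                           measurement hypothesis j (j = 0: none is true)
    likelihood_ratios[k] : L(d) for measurement k + 1
    Returns weights of the same shape, normalised to sum to one.
    """
    scaled = [
        [w if j == 0 else likelihood_ratios[j - 1] * w
         for j, w in enumerate(row)]
        for row in base_weights
    ]
    norm = sum(sum(row) for row in scaled)  # the normalising constant E
    return [[w / norm for w in row] for row in scaled]


base = [[0.1, 0.2, 0.1],
        [0.2, 0.3, 0.1]]
unchanged = reweight_with_classification(base, [1.0, 1.0])
favoured = reweight_with_classification(base, [2.0, 0.5])
```

With all likelihood ratios equal to one the weights are unaltered, while a ratio greater than one for measurement 1 shifts probability towards every hypothesis that takes measurement 1 to be the true one.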

A sub-optimal version of the filter described above may be
implemented using the coarse acceptance test and one of the mixture
reduction techniques of Chapter 3. The filter was first reported by
Nagarajan et al³⁵ in 1984 and was implemented using the PDAF
approximation. Note that minimal extra computation over that required
for the baseline problem is necessary to incorporate the classification


information.

The classification feature may also be of the discrete (0, 1)
type, such that:

\[ \Pr\{d \mid T\} =
\begin{cases}
P_T & \text{if } d = 1 \\
1 - P_T & \text{if } d = 0 ,
\end{cases} \]

and

\[ \Pr\{d \mid F\} =
\begin{cases}
P_F & \text{if } d = 0 \\
1 - P_F & \text{if } d = 1 .
\end{cases} \]

Thus d = 1 indicates that the measurement is likely to be true and
the probability $P_T$ of correctly recognizing a true measurement is
known. Similarly d = 0 indicates that the measurement is likely to
be false and the probability $P_F$ of recognizing a false measurement is
known. For this discrete case, the likelihood ratio in equation (7.5)
should be replaced by:

\[ L(d) = \frac{\Pr\{d \mid T\}}{\Pr\{d \mid F\}} . \tag{7.6} \]

Note that if $P_T = P_F = \tfrac{1}{2}$, then L(d) = 1 and in this case, as
expected, the classification feature is ignored and the posterior
probabilities are unaltered. If:

\[ P_T = 1 \quad \text{and} \quad 0 < P_F < 1 , \]

then

\[ L(0) = 0 , \qquad L(1) = 1 / (1 - P_F) . \]

In this case the classifier always recognizes a true measurement but
sometimes mistakes false for true. So any hypothesis for which $d_j = 0$
($j \ne 0$) is given a zero probability weighting via the likelihood ratio.
If $P_F = 1$ and $0 < P_T < 1$, then the classifier always recognizes a
false measurement but sometimes mistakes true for false. In this case
the likelihood ratio of equation (7.6) is not defined when
d = 1 and so it is not valid to divide through by $\Pr\{D \mid \psi_0\}$ in
equation (7.3) if any element of D is unity. However each probability
weighting $\Pr\{\Psi'_{ij} \mid Z, D\}$ contains the factor:

\[ \Pr\{D \mid \psi_j\} = \Pr\{d_j \mid T\} \prod_{l=1,\, l \ne j}^{m} \Pr\{d_l \mid F\} . \]

If there is an element of D such that $d_l = 1$, then $\Pr\{D \mid \psi_j\} \ne 0$
only for j = l. Thus the true measurement is
identified. Since the classifier always recognizes false measurements
and there is at most one true measurement, only one element of D can
be unity. However if $P_T$ is less than one, the true measurement may
not be recognized, so that all elements of D may be zero. In this
case $\Pr\{D \mid \psi_j\}$ is constant for $j \ne 0$. Note that if $P_F = 1$, the
classifier will pick out the correct hypothesis on 100 $P_T$ % of
occasions when the true measurement is present. Thus in a high density
of false measurements, a perfect false measurement discriminator may
well be more useful than a perfect true measurement discriminator. If
$P_F = P_T = 1$, the correct hypothesis is always identified.
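
The special cases discussed above can be checked numerically by evaluating the classification factor of equation (7.4) directly for discrete flags, which avoids the undefined likelihood ratio when $P_F = 1$. The following Python sketch is illustrative only; the function names are assumptions, and index 0 of the returned list corresponds to the hypothesis that none of the measurements is true.

```python
def pr_flag_given_true(d, p_t):
    """Pr{d | T} for a discrete (0, 1) flag."""
    return p_t if d == 1 else 1.0 - p_t


def pr_flag_given_false(d, p_f):
    """Pr{d | F} for a discrete (0, 1) flag."""
    return p_f if d == 0 else 1.0 - p_f


def classification_factors(flags, p_t, p_f):
    """Return Pr{D | measurement j true} for j = 0..m (j = 0: none true)."""
    factors = []
    for j in range(len(flags) + 1):
        value = 1.0
        for l, d in enumerate(flags, start=1):
            if l == j:
                value *= pr_flag_given_true(d, p_t)
            else:
                value *= pr_flag_given_false(d, p_f)
        factors.append(value)
    return factors


# Perfect false-measurement discriminator, one flag raised:
raised = classification_factors([0, 1, 0], p_t=0.7, p_f=1.0)
# Perfect false-measurement discriminator, true measurement missed:
missed = classification_factors([0, 0, 0], p_t=0.7, p_f=1.0)
# Perfect true-measurement discriminator:
perfect_true = classification_factors([0, 1], p_t=1.0, p_f=0.6)
```

With $P_F = 1$ and one flag raised, only the hypothesis for the flagged measurement keeps a non-zero factor. With no flag raised the factor is the same for every $j \ne 0$, so the classifier cannot separate the measurements. With $P_T = 1$, any measurement flagged 0 is excluded.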

7.2.2  Simulation 


To demonstrate the possible improvement in tracking performance
when classification data is available, the baseline example of
section 4.2 has been extended to include a discrete (0, 1) type
discriminator. The problem parameters used in Chapter 4 are retained:

\[ \rho r = 0.012 , \qquad P_D = 1 \]

and the performance of the classifier is defined by $P_T$ and $P_F$ as
indicated above. Mixture reduction is carried out using either the
PDAF or the Clustering Algorithm with $N_T = 20$.

Fig 7.1 shows the track survival time $N_{AVE}$ and the average number
of mixture components generated as a function of $P_T$ when $P_T = P_F$.
Also the average computation time per step is recorded in Table 7.1.
As expected, $N_{AVE}$ increases with the probability of correct
classification and useful performance improvement may be obtained even
with a mediocre discriminator. For example with $P_T = P_F = 0.7$, track
lifetime of the CAF is increased by a factor of 2.5, although for the
PDAF substantial improvement is not obtained until $P_T = P_F = 0.8$,
when the improvement factor for both filters is about 3.4. Also the
average processing time per step (Table 7.1) and the number of mixture
components generated decrease as the performance of the discriminator
improves. This is because the discriminator tends to suppress incorrect
hypotheses, and this helps to keep the acceptance region small so that
fewer components are generated.

Fig 7.2 shows the effect of varying $P_T$ with $P_F$ fixed at 0.99,
and Fig 7.3 shows results for varying $P_F$ with $P_T = 0.99$. The
corresponding results for the CAF are similar in these two cases,
although the PDAF performs significantly better for small values of
$P_T$ with $P_F = 0.99$ than for small values of $P_F$ with $P_T = 0.99$
(see previous section).

In each of Figs 7.1 to 7.3, the CAF track lifetime is always
several times longer than that of the PDAF. However as $P_F$ and $P_T$
increase, the difference in performance between the two filters
decreases (cf section 5.2).

7.3 Multiple sensors without classification data

7.3.1 Problem statement

In this section the baseline problem is extended to multiple
sensors. Each of these has similar characteristics to the sensor
described in Chapter 2 and no classification data is available. It is
assumed that there are $N_S$ independent sensors and that at each
time step k, each sensor u produces $m_u$ measurements:

\[ Z_u = \{ \mathbf{z}_{uj} : j = 1, \ldots, m_u \} . \]

For each sensor u:

(i) At most one true measurement is produced with probability
$P_{Du}$. This true measurement is an independent sample from the
Gaussian pdf $N(\mathbf{z}; H_u \mathbf{x}, R_u)$.

(ii) False measurements are uniformly distributed over the
surveillance region of the sensor. The density of false
measurements is $\rho_u$.

Since  each  sensor  is  independent,  data  from  each  sensor  may  be 
incorporated  sequentially  using  the  update  relations  of  section  2.3.2. 
This  is  convenient  since  the  computer  code  for  the  single  sensor 
problem  may  be  employed  with  only  minor  modifications.  This  recursive 
solution  is  quite  straightforward  but  for  completeness  it  is  included 
in  Appendix  E. 

In any implementation of the filter it is necessary to control
the proliferation of hypotheses. Depending on the density of false
measurements, it may be feasible to apply a mixture reduction algorithm
only once per time step, after measurements from all sensors have been
processed (Fig 7.4b). In this case the order in which sensors are
processed is irrelevant. Alternatively it may be desirable to carry
out reduction after processing measurements from each sensor (Fig 7.4a).
In this case the order in which sensors are processed may affect the
performance of the filter. In the following section, these points are
investigated by simulation for a two sensor filter.
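
The statement that processing order is irrelevant when no reduction is applied between sensors can be illustrated in the simplest possible setting. The Python sketch below assumes a single scalar Gaussian component and one true measurement per sensor with no false measurements, which is far simpler than the mixture filter considered here. The function names and numerical values are illustrative.

```python
def gaussian_update(mean, var, z, r):
    """Scalar Kalman measurement update for measurement z with variance r."""
    gain = var / (var + r)
    return mean + gain * (z - mean), (1.0 - gain) * var


def fuse_sequentially(prior, measurements):
    """Incorporate (z, r) pairs one sensor at a time."""
    mean, var = prior
    for z, r in measurements:
        mean, var = gaussian_update(mean, var, z, r)
    return mean, var


prior = (0.0, 4.0)
sensor_1 = (1.0, 1.0)     # measurement and its variance
sensor_2 = (2.0, 0.25)

order_12 = fuse_sequentially(prior, [sensor_1, sensor_2])
order_21 = fuse_sequentially(prior, [sensor_2, sensor_1])
```

Both orders give the same posterior mean and variance because each update is an exact application of Bayes' rule. Once an approximation such as mixture reduction is inserted between the two updates this equivalence is no longer guaranteed, which is why the order may matter in the scheme of Fig 7.4a.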

7.3.2  Simulation  example:  a  two  sensor  filter 

To demonstrate the performance benefits that may be obtained with
multiple sensors, the operation of a two sensor filter has been
simulated for the tracking problem of section 4.2. The first sensor
has parameters:

\[ r_1 = q \Delta t^4 , \qquad \rho_1 = 0.012 / q \Delta t^4 , \qquad P_{D1} = 1 \]

so that without sensor 2, the tracking problem would be identical to
the example of Chapter 4. The second sensor is of the same type but
may have different values for the parameters $r_2$, $\rho_2$ and $P_{D2}$. To
facilitate comparison with the single sensor filter, the track loss
criteria are identical to those given in section 4.3 and are based
solely on sensor 1. Thus track loss through rejection of true
measurements is only tested for sensor 1 and the tracking error
reference is derived from the equivalent Kalman filter based on
sensor 1 only (with $\rho_1 = 0$).

As indicated above, data sets from each sensor are incorporated
sequentially. The two schemes for mixture reduction shown in Fig 7.4
have both been investigated using the Clustering Algorithm (CA) with
the usual thresholds and $N_T = 20$. Also performance with the PDAF
approximation has been studied. The PDAF must be applied directly
after processing each sensor as retention of more than one component
is not possible with this algorithm. This technique for incorporating
multiple sensors using the PDAF has been implemented by Houles and
Bar-Shalom¹⁴.

In the tracking simulation the parameters of sensor 2 were
nominally chosen to have the same values as those of sensor 1 and then
each of the parameters $r_2$, $\rho_2$ and $P_{D2}$ was varied in turn. For
reduction via the Clustering Algorithm, the average track survival
times $N_{AVE}$ (for 100 replications) with 95% confidence limits are
shown in Figs 7.5 to 7.7. For each set of parameters, $N_{AVE}$ is shown
for the two sensor filter with reduction after processing both sensors
(labelled TB) and with reduction after processing each sensor
(labelled T12 when sensor 1 is processed first and labelled T21 when
sensor 2 is processed first). Also results for the single sensor
filter using only sensor 1 (labelled S1) and using only sensor 2
(labelled S2) are shown for comparison. To maintain consistency, the
tracking error reference of the track loss criterion for S2 is derived
from the equivalent Kalman filter based on sensor 1 (with $\rho_1 = 0$).
The average number of mixture components before and after application
of the Clustering Algorithm is also shown in the figures. When the
Clustering Algorithm is applied twice on each time step (for T12 and T21),
both of these applications are included in the averages. Figs 7.8
to 7.10 show similar results for the PDAF approximation, except that TB
using the Clustering Algorithm has been included for comparison.
Average cpu time per time step is given in Table 7.2 for all of these
results.

In many cases, employing two sensors gives an increase in track
lifetime $N_{AVE}$ with respect to a filter using measurements from only
one of the two sensors. The greatest improvement factor is obtained
when the two sensors are identical, for which $N_{AVE}$ from TB exceeds
$N_{AVE}$ from the single sensor CAF by a factor of about 8.5. When there
is a large discrepancy between the quality of the two sensors, the
performance of the two sensor filters does not usually differ
significantly from the best of the single sensor filters. However
in two cases where the track lifetime of S2 is greater than 15 times
that of S1 ($\rho_2 = 0.001 / q \Delta t^4$ and $\rho_2 = 0.002 / q \Delta t^4$ in Figs 7.6 and 7.9),
the two sensor filter T12 is outperformed by S2 for both the CAF and
the PDAF. In each of these two cases, TB using the Clustering
Algorithm is still better than the CAF using sensor 2 alone. Thus
it appears that when the quality of the two sensors is very dissimilar,
it is important to retain the detailed structure of the mixture between
processing data from the sensors. Presumably this allows the good
sensor to selectively reinforce or suppress components generated by the
poor sensor.

In all cases track lifetime for TB is greater than or not
significantly different from $N_{AVE}$ for T12 or T21, for the Clustering
Algorithm. Also TB, which uses the Clustering Algorithm, always gives
a track lifetime at least five times longer than that of T12 or T21
using the PDAF. For both the CAF and the PDAF, $N_{AVE}$ from T12 is
usually similar to $N_{AVE}$ from T21. When a significant difference
does occur, the longer track survival time is usually obtained when the
better sensor is processed first. The one exception to this is for the
CAF with $r_2 = 0.01 \, q \Delta t^4$ (Fig 7.5).

For the Clustering Algorithm, the average cpu time per step for
the two sensor filters is almost always much less than that of the CAF
employing only the poor sensor, and greater than the CAF employing only
the good sensor (see Table 7.2). For identical sensors, the cpu time
per step for TB is 16% greater than that of the single sensor CAF,
while T12 and T21 give a 25% saving in cpu time. These computation
times are closely related to the average number of mixture components
generated by the filters. The effect on cpu time of incorporating a
second sensor is broadly similar for the PDAF approximation, except
that for $P_{D2} < 1$, the two sensor filters are slower than the PDAF
using sensor 2 alone (which performs very poorly).

It  should  be  remembered  that  the  above  observations  only  apply 
to  the  example  simulated  here.  However  it  is  quite  likely  that  the 
broad  conclusions  apply  to  a  wide  range  of  examples.  Detailed  results, 
such  as  the  percentage  of  cpu  time  saved  by  employing  two  identical 
sensors rather than one of them, are likely to be problem dependent.

7.4 Incorporation of data from an auxiliary sensor with a
classification capability


7.4.1  Problem  statement 

This  data  fusion  example  has  been  chosen  to  show  how  data  from  a 
secondary  or  auxiliary  sensor  may  be  used  to  assist  a  primary  sensor 
with  modest  changes  to  the  tracking  filter.  This  example  is  an 
extension  of  the  sector  surveillance  problem  of  Chapter  6,  and  as 
already  described  it  is  assumed  that  the  primary  sensor  produces 
measurements  in  polar  co-ordinates.  The  auxiliary  sensor  produces 
bearing  only  measurements,  but  a  classification  flag  is  associated  with
each  of  these.  Since  the  auxiliary  sensor  does  not  supply  range,  on  its 
own  it  would  give  poor  tracking  performance.  The  auxiliary  sensor  is 
co-located  with  the  primary  sensor  at  the  origin  and  measurement  sets 
are  produced  coincidently  by  both  sensors.  It  can  be  seen  that  this 
problem  includes  elements  from  each  of  the  previous  sections. 

The auxiliary sensor produces false measurements which are
uniformly distributed in bearing over the surveillance sector with a
density $\rho_2$ per radian. A true measurement has a Gaussian
distribution about the actual target bearing with a standard deviation
of $\sigma_2$ radians, and the probability of detecting the target is
$P_{D2}$. The classification flag associated with each measurement is of
the discrete (0, 1) type. A value of one indicates that the measurement
has been classified true, while zero indicates that it has been
classified false. The probability of correctly recognizing a true
measurement is $P_T$, and $P_F$ is the probability of correctly
recognizing a false measurement. As in section 7.2, the classification
flag is independent of the value of the measurement.



7.4.2  A  sub-optimal  filter 

The main idea behind the design of this filter is to use data
from the auxiliary sensor to modify only the probability weights of the
mixture distribution resulting from the primary sensor measurements.
So the auxiliary sensor data is to be used either to reinforce or to
weaken the weightings of the mixture components. The mixture components
themselves are not changed. This approach avoids the usual splitting
of components when measurements from the second sensor are incorporated.
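
The weight-only update described above can be sketched in a few lines. This is a minimal illustration, not the report's implementation: the function name `reweight` and the example numbers are hypothetical, and the modifying factors are taken as given inputs rather than computed from the auxiliary measurements.

```python
def reweight(weights, factors):
    """Multiply each mixture weight by its modifying factor and renormalize.

    The component means and covariances are untouched, so the number of
    mixture components does not grow.
    """
    if len(weights) != len(factors):
        raise ValueError("one factor is required per mixture component")
    scaled = [w * f for w, f in zip(weights, factors)]
    total = sum(scaled)
    if total <= 0.0:
        raise ValueError("all components were suppressed")
    return [s / total for s in scaled]


# Hypothetical example: the auxiliary data suppresses component 0
# and reinforces component 1.
weights = [0.5, 0.3, 0.2]
factors = [0.1, 2.0, 1.0]
new_weights = reweight(weights, factors)
```

Because only the weights change, any mixture reduction step that follows operates on the same set of components as in the single sensor filter.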

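After processing the measurements from the primary sensor at
some time step, the posterior pdf of $\mathbf{x}$ is given by (following
section 2.3.2):

$$p(\mathbf{x} \mid Z_1) = \sum_{i=1}^{n} \sum_{j=0}^{m_1} p(\mathbf{x} \mid \psi_{1j}, \mathcal{H}_i, Z_1)\, \Pr\{\psi_{1j}, \mathcal{H}_i \mid Z_1\} , \qquad (7.7)$$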


where $\psi_{1j}$ is a hypothesis on the measurements from the primary
sensor (see Appendix E). Since the sensor measurements are in polar
co-ordinates, the extended Kalman filter approximation is used to
evaluate the components and probability weights of this Gaussian
mixture (see section 6.1). The data from the auxiliary sensor is
denoted:

$(Z_2, D_2)$ ,

where $D_2$ is the set of (0, 1) type classification features:


After  applying  Bayes  theorem  and  deleting  redundant  dependencies  it  can 
be  shown  that: 


and $E$ is the normalizing denominator chosen so that the summation of
the RHS of equation (7.10) over $i$ and $j$ is unity. (Note that $D_2$
is independent of the past, so its probability does not include a
dependency on past data.) Thus the resulting filter is the same as the
usual single sensor filter except that each probability weighting is
modified by the factor:


For the problem of section 7.4.1, the auxiliary sensor data
consists of bearing measurements, each with an associated classification
flag. Thus using an amalgam of results from sections 2.3.2, 6.1, 7.2.1
and Appendix E it can be shown that equation (7.11) is given by:

if  1  =  0


(7.12) 


where $E'$ is the normalizing denominator, chosen so that:

Also $\theta_{2\ell}$ is the auxiliary bearing measurement $\ell$.
$\hat{\theta}_{ij}$ is the expected value of the true auxiliary
measurement under hypothesis $(\psi_{1j}, \mathcal{H}_i)$, and it is
given by:

$$\hat{\theta}_{ij} = \tan^{-1}(\hat{y}_{ij} / \hat{x}_{ij}) ,$$

where $(\hat{x}_{ij}, \hat{y}_{ij})$ is the mean target position of the
mixture component of equation (7.7) corresponding to hypothesis
$(\psi_{1j}, \mathcal{H}_i)$. $\sigma_{ij}^2$ is the variance of the
innovation $(\theta_{2\ell} - \hat{\theta}_{ij})$ under hypothesis
$(\psi_{1j}, \mathcal{H}_i)$, and from equations (6.5) and (6.6) it is
given by:

where $r_{ij}^2 = \hat{x}_{ij}^2 + \hat{y}_{ij}^2$ and $p_{11}$, $p_{33}$
and $p_{13}$ are elements of the symmetric matrix $P_{ij}$, which is the
covariance of the mixture component of (7.7) corresponding to
hypothesis $(\psi_{1j}, \mathcal{H}_i)$. If $\sigma_{ij}$ is large, then
there is likely to be a large uncertainty in the association of
auxiliary measurement to mixture component, and the extra data is
unlikely to be informative. However if $\sigma_{ij}$ is small so that
the Gaussian factor in equation (7.12) is selective, the auxiliary data
may provide useful extra information.
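
As an illustration of the quantities just described, the sketch below computes a predicted bearing and a first-order innovation variance for one mixture component. It is a sketch under stated assumptions rather than the report's equation: it assumes $p_{11}$, $p_{33}$ and $p_{13}$ are the variance of x, the variance of y and their covariance, it uses the standard linearization of the bearing about the component mean, and it uses `atan2` in place of $\tan^{-1}(y/x)$ to resolve the quadrant. The function names are hypothetical.

```python
import math


def predicted_bearing(x, y):
    """Expected bearing of the target for a component with mean position (x, y)."""
    return math.atan2(y, x)


def innovation_variance(x, y, p11, p13, p33, sigma2):
    """First-order variance of (measured bearing - predicted bearing).

    The gradient of atan2(y, x) with respect to (x, y) is (-y, x) / r^2,
    so the position covariance projects onto bearing as
    (y^2 p11 - 2 x y p13 + x^2 p33) / r^4. The auxiliary sensor
    measurement variance sigma2**2 is then added.
    """
    r_squared = x * x + y * y
    if r_squared == 0.0:
        raise ValueError("bearing is undefined at the origin")
    projected = (y * y * p11 - 2.0 * x * y * p13 + x * x * p33) / (r_squared ** 2)
    return projected + sigma2 ** 2


# Hypothetical component on the x axis at 10 km with 0.2 km cross-range
# standard deviation, observed by a sensor with 0.01 rad bearing noise.
bearing = predicted_bearing(10.0, 0.0)
variance = innovation_variance(10.0, 0.0, p11=0.09, p13=0.0, p33=0.04, sigma2=0.01)
```

For a target on the x axis only the cross-range variance $p_{33}$ contributes, which is why a component with a large cross-range spread makes the auxiliary bearing unselective.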


From above it can be seen that certain elements of the mean
$\hat{\mathbf{x}}_{ij}$ and the covariance $P_{ij}$ of each component of
equation (7.7) are required for the evaluation of $F_{ij}$. These terms
are already available for an implementation using the Clustering
Algorithm, so that incorporating the auxiliary sensor data is a small
computational overhead. However, for the standard PDAF,
$\hat{\mathbf{x}}_{ij}$ and $P_{ij}$ are not explicitly evaluated and so
for this filter the extra computation requirement is significant. To
reduce the processor load for the PDAF it is suggested that components
with very low probability weights are discarded before calculating the
modifying factor:


1=0 


In  the  simulation  of  the  following  section,  components  with 
probability  weights  below  0.001  are  ignored  for  the  PDAF.  Also  for 
both  the  CAF  and  the  PDAF,  an  acceptance  test  is  applied  to  the 
auxiliary  measurements  for  each  component  of  equation  (7.7).  Mixture 
reduction  is  applied  after  modifying  the  probability  weights,  and 
prediction  forwards  to  the  next  time  step  follows  as  usual  from  the 
state  propagation  equation. 

7.4.3  Simulation 

Simulation studies have been carried out to demonstrate the
possible improvement in tracking performance through sub-optimal
processing of auxiliary sensor measurements. The standard parameters
of Table 6.1 have been assumed for the target trajectory and for the
primary sensor, except that the density of false measurements for the
primary sensor has been increased to $\rho = 30$ km$^{-1}$ rad$^{-1}$. The
performance of the auxiliary sensor is described by five parameters:

(i)  the standard deviation of the true measurement bearing
error $\sigma_2$ (radians),

(ii)  the density of false measurements $\rho_2$ (radians$^{-1}$),

(iii)  the probability of correctly recognizing a true
measurement $P_T$,

(iv)  the probability of correctly recognizing a false
measurement $P_F$,

(v)  the probability of detecting the target $P_{D2}$.

For this simulation we have set $P_T = P_F$, and the following standard
set of parameters for the auxiliary sensor has been chosen:

$\sigma_2 = 0.01745$ rad

$\rho_2 = 45$ rad$^{-1}$

$P_T = P_F = 0.9$

$P_{D2} = 1$ .

Thus the standard deviation of the true measurement $\sigma_2$ is the
same as for the primary sensor, and $\rho_2$ is related to the density
of primary sensor false measurements by:

$$\rho_2 = \tfrac{1}{12}\,\rho\,(r_2 - r_1) ,$$

where $r_2 - r_1 = 18$ km is the range extent of the surveillance sector.
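
The standard auxiliary false measurement density can be checked against the primary sensor values quoted in this section. The variable names below are illustrative only; the factor of one twelfth is the one implied by the quoted values of 30 km$^{-1}$ rad$^{-1}$, 18 km and 45 rad$^{-1}$.

```python
# Primary sensor false measurement density (km^-1 rad^-1) and the range
# extent of the surveillance sector (km), as quoted in the text.
rho = 30.0
range_extent = 18.0

# Integrating the primary density over range gives false measurements
# per radian of bearing.
rho_per_radian = rho * range_extent

# The standard auxiliary sensor density is one twelfth of that figure.
rho2 = rho_per_radian / 12.0
```

So the standard auxiliary sensor sees one twelfth as many false measurements per radian as the primary sensor does when its measurements are collapsed onto bearing.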

Each of the parameters $\sigma_2$, $\rho_2$ and $P_T$ has been varied in
turn while keeping the parameters of the primary sensor fixed. Figs
7.11 to 7.13 show the percentage of tracks maintained by the Auxiliary
Sensor filter out of 100 replications for each set of parameters
tested. For the track maintenance criterion of (6.1), $\sigma_x$ and
$\sigma_y$ are obtained from the equivalent Kalman filter based on the
primary sensor only. The average processing time for a single step and
the error statistic $E$ (see section 6.3.1) are given in Table 7.3.

Figs 7.11 to 7.13 clearly show that the Auxiliary Sensor filter
can give a significant performance improvement over the primary sensor
alone. This is most apparent for the PDAF which can only retain 1% of
the tracks without the auxiliary sensor. As would be expected,
performance deteriorates with increasing $\sigma_2$ and $\rho_2$, so as
these parameters become large, performance approaches the primary
sensor alone case (Figs 7.11 and 7.12). Also filter performance
improves as $P_T$ increases (Fig 7.13). For the case $P_T = P_F = 0.5$,
the classifier supplies no useful information. However for
$P_T = P_F = 1$, the true auxiliary measurement is always identified so
that the presence of false auxiliary measurements is irrelevant (cf
filter performance for $\rho_2 = 0.36$ rad$^{-1}$ in Fig 7.12 for which
false measurements are sparse).

Table 7.3 shows that if the performance of the auxiliary sensor
is good ($\sigma_2$ or $\rho_2$ low, or $P_T$ high), the incorporation of
the extra data reduces the average computation time for the CAF: the
extra information enables the filter to reduce the number of retained
components (see Figs 7.11 to 7.13). Processing time is always greater
for the Auxiliary Sensor PDAF than for the standard single sensor PDAF.
This is because the mean and covariance of each mixture component must
be explicitly calculated for the Auxiliary Sensor PDAF implementation
(see previous section). When the density of the auxiliary false
measurements is large, the processing times for the Auxiliary Sensor
filters are several times greater than those of the standard filters.

Examination of the error statistic $E$ in Table 7.3 shows that
for lost tracks, the filters significantly underestimate their tracking
error (as is also the case for the standard filters, see section 6.3.1).
For CAF held tracks, with the exception of the cases $\sigma_2 = 0.005$ rad
and $\sigma_2 = 0.04$ rad, $E$ is always within 50% of the 'correct' value
of four. However for the PDAF, the values of $E$ show a much greater
spread about four, with a tendency for $E$ to increase with the
performance of the auxiliary sensor.

7.5  Conclusions

In this chapter we have shown how Bayesian filters may be applied
to the data fusion problem. Incorporating data from an extra sensor or
an imperfect measurement classifier may significantly improve tracking
performance and reduce processing time. However if the performance of
the additional sensor is very inferior to the original sensor, a large
processing overhead may result in only a minor performance improvement.


Table 7.1

PROCESSOR TIMINGS FOR FILTERS WITH CLASSIFICATION FLAG

  Classification        Average cpu time for
  parameters            single step (ms)

  P_T      P_F          CAF        PDAF

  0.50     0.50         5.930      1.120
  0.60     0.60         5.420      1.140
  0.70     0.70         3.710      0.659
  0.80     0.80         2.210      0.460
  0.90     0.90         1.070      0.206
  0.95     0.95         0.675      0.199
  0.99     0.99         0.519      0.195

  0.30     0.99         3.240      0.379
  0.50     0.99         1.920      0.236
  0.90     0.99         0.679      0.199
  0.95     0.99         0.594      0.196

  0.99     0.3          3.240      0.813
  0.99     0.5          2.010      0.6’3
  0.99     0.9          0.677      0.198
  0.99     0.95         0.581      0.197

Table 


1510.000  28.900  51.300  7/4.50  "  6.500  5.137  5.710 


PROCESSOR  TIMINGS  AND  ERROR  STATISTIC  E  FOR  AUXILIARY  SENSOR  FILTER 


  Parameters of auxiliary sensor     Tracks    Average cpu time      Error statistic E
                                     held or   for single step (ms)
  sigma_2   rho_2      P_T = P_F     lost      CAF       PDAF        CAF        PDAF
  (rad)     (rad^-1)

  Primary sensor only                H          15.00    -              3.660      3.142*
                                     L          52.80    -           2081.000      6.694

  0.005       45.00    0.9           H           9.11    -             22.930     16.490
                                     L           9.33    -           2779.000    566.600
  0.01        45.00    0.9           H          10.00     2.94          4.262     71.080
                                     L           9.57     3.95       2419.000    444.100
  0.01745     45.00    0.9           H          11.50     3.85          3.691     27.560
                                     L          11.50     4.96       1713.000    258.400
  0.04        45.00    0.9           H          14.50     5.73        435.420      5.581
                                     L          14.60     7.37       1742.000     88.180
  0.07        45.00    0.9           H          16.90     8.32          3.558      4.346
                                     L          22.80     9.61       3069.000    128.300
  0.1         45.00    0.9           H          18.60     5.93          3.601      4.204
                                     L          22.70    12.60       2784.000     69.230
  0.2         45.00    0.9           H          21.25    10.37*         3.643      3.339*
                                     L          64.22    19.22       2159.000     71.820

  0.01745      0.36    0.9           H           8.54     1.95          6.106     17.490
                                     L           7.72     2.34       1377.000     39.650
  0.01745      1.80    0.9           H           8.64     1.97          3.718     18.240
                                     L           8.27     2.41       1344.000    199.300
  0.01745      9.00    0.9           H           9.20     2.23          3.707     10.650
                                     L           9.52     2.67       6627.000    422.200
  0.01745     45.00    0.9           H          11.50     3.85          3.691     27.560
                                     L          11.50     4.96       1713.000    258.400
  0.01745    180.00    0.9           H          19.00    10.30          3.590      2.711
                                     L          20.90    18.60       2362.000     95.980
  0.01745    720.00    0.9           H          44.40    23.90          3.927      2.678
                                     L         126.00    74.00       2522.000     96.390
  0.01745   1440.00    0.9           H          76.20    68.50          3.749     21.140
                                     L          79.40   145.00       2495.000    155.400

Table 7.3 (concluded)

  Parameters of auxiliary sensor     Tracks    Average cpu time      Error statistic E
                                     held or   for single step (ms)
  sigma_2   rho_2      P_T = P_F     lost      CAF       PDAF        CAF        PDAF
  (rad)     (rad^-1)

  0.01745     45.0     0.5           H          15.00     4.44        -            2.408
                                     L          21.60     9.19        -           41.770
  0.01745     45.0     0.6           H          16.70     4.41          3.585      2.383
                                     L          18.10     9.08       3012.000     23.730
  0.01745     45.0     0.7           H          14.20     5.69          3.607      3.677
                                     L          15.00     8.10       6104.000     41.760
  0.01745     45.0     0.8           H          12.90     4.09          3.585      2.761
                                     L          14.70     6.79       2896.000    190.900
  0.01745     45.0     0.9           H          11.50     3.35          3.691     27.560
                                     L          11.50     4.96       1713.000    258.400
  0.01745     45.0     0.95          H          10.70     3.17          3.745     31.050
                                     L          10.50     4.22       2183.000    201.500
  0.01745     45.0     0.99          H           9.87     2.89          3.711     16.210
                                     L           9.55     3.70       1622.000    204.000
  0.01745     45.0     1.0           H           9.48     2.76          3.836     20.030
                                     L           9.92     3.31       3709.000    388.900

*  Indicates a small sample (less than five replications)

8  MULTIPLE MEASUREMENT CLASSES: THE PROBLEM OF INTERFERING
MEASUREMENTS

8.1  Introduction

In  the  preceding  chapters  it  has  been  assumed  that  measurements 
are  either  true  or  false,  and  that  at  most  one  of  the  measurements  from 
a  single  sensor  may  be  true  at  any  time  step.  The  problem  is  now 
extended  to  allow  further  classes  of  measurement  which  may  or  may  not 
be  associated  with  the  target.  The  formal  Bayesian  solution  to  this 
new  problem,  which  is  given  in  the  following  section,  is  a  straightforward 
extension  of  the  baseline  filter.  However,  except  for  simple  cases, 
it  is  not  easy  to  apply  this  general  solution  to  derive  practical 
filters for specific tracking examples. Thus to arrive at useful
recursive filters it may be necessary to impose rather crude
approximations.

In  section  8.3  a  tracking  problem  with  three  measurement  classes 
is  described.  Two  of  these  classes  are  the  usual  true  and  false 
measurements,  while  the  third  class  consists  of  interfering  measurements 
associated  with  the  target  position.  As  an  extra  complication  this 
interference  is  intermittent  and  its  switching  on  and  off  may  be 
modelled  by  a  Markov  process.  A  practical  sub-optimal  tracking  filter 
has  been  derived  from  the  general  solution  of  section  8.2  by  making 
several  approximations. 

8.2  Problem formulation and general solution

At each time step a set of measurements $Z$ is received:

$$Z = \{\mathbf{z}_i : i = 1, \ldots, m\} .$$

Each measurement $\mathbf{z}$ of $Z$ may belong to any one of $N_c$
classes, and the class membership of $\mathbf{z}$ may be unknown. However
if $\mathbf{z}$ does belong to class $C_j$, then $\mathbf{z}$ is an
independent sample from the pdf:

$$p(\mathbf{z} \mid \mathbf{x}, C_j) , \qquad (8.1)$$

which is assumed to be available. It is also assumed that the
probability distribution of the number of received measurements from
each class is given. Thus the probability of receiving $m'_j$
measurements belonging to class $j$ is known and is denoted
$g_j(m'_j)$. Note however that $m'_j$ is in general not known and that
the membership of each class may only be hypothesized. Clearly:

$$\sum_{j=1}^{N_c} m'_j = m .$$

As usual the state propagation equation is given by equation (2.1) and
the problem is to obtain the posterior pdf of $\mathbf{x}$ at each time
step.

To solve this problem, following section 2.3.2, it is necessary
to construct all feasible measurement association hypotheses
$\mathcal{H}'$ and so to evaluate the posterior pdf of $\mathbf{x}$:

$$p(\mathbf{x} \mid Z) = \sum_{\text{All } \mathcal{H}'} p(\mathbf{x} \mid \mathcal{H}', Z)\, \Pr\{\mathcal{H}' \mid Z\} . \qquad (8.2)$$

(The time step subscript $k$ and explicit dependency on past data are
omitted in this chapter, although the conditioning should be understood
throughout.)

This equation is similar to equation (2.19) and as usual:

$$\mathcal{H}' = (\mathcal{H}, \Psi) ,$$

is a joint hypothesis, where $\mathcal{H}$ is a hypothesis on the class
membership of data received up to and including the previous time step,
and $\Psi$ is an association hypothesis on the current measurement set
$Z$. Also we assume that $\Pr\{\mathcal{H}\}$ and
$p(\mathbf{x} \mid \mathcal{H})$ are available from the previous
recursion.

First consider $p(\mathbf{x} \mid \mathcal{H}', Z)$. From Bayes theorem:

$$p(\mathbf{x} \mid \mathcal{H}', Z) = \frac{p(Z \mid \mathbf{x}, \mathcal{H}')\, p(\mathbf{x} \mid \mathcal{H}')}{p(Z \mid \mathcal{H}')} . \qquad (8.3)$$

Suppose that $\Psi$ assigns the $i$th member of $Z$ to class $C_{f(i)}$,
then since the members of $Z$ are independent:

$$p(Z \mid \mathbf{x}, \mathcal{H}') = \prod_{i=1}^{m} p(\mathbf{z}_i \mid \mathbf{x}, C_{f(i)}) . \qquad (8.4)$$

Also

$$p(\mathbf{x} \mid \mathcal{H}') = p(\mathbf{x} \mid \mathcal{H}) ,$$

which is available from the previous recursion. The denominator of the
RHS of equation (8.3) is given by:

$$\int p(Z \mid \mathbf{x}, \mathcal{H}')\, p(\mathbf{x} \mid \mathcal{H}')\, d\mathbf{x} . \qquad (8.5)$$

Thus in principle $p(\mathbf{x} \mid \mathcal{H}', Z)$ can be found. In
practice it is likely to be difficult to find a simple analytical
expression unless the underlying distributions are Gaussian or the
measurements are independent of $\mathbf{x}$ for many classes (as in the
baseline problem).


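Now consider the posterior probability:

$$\Pr\{\mathcal{H}' \mid Z\} = \Pr\{\mathcal{H}, \Psi \mid Z\} .$$

This is given by equation (2.10):

$$\Pr\{\mathcal{H}' \mid Z\} = \frac{p(Z \mid \mathcal{H}')\, \Pr\{\Psi \mid \mathcal{H}\}\, \Pr\{\mathcal{H}\}}{p(Z)} . \qquad (8.6)$$

Since the members of $Z$ are independent, following equation (2.11) we
have:

$$p(Z \mid \mathcal{H}') = \int p(Z \mid \mathbf{x}, \mathcal{H}')\, p(\mathbf{x} \mid \mathcal{H})\, d\mathbf{x} = \int \prod_{i=1}^{m} p(\mathbf{z}_i \mid \mathbf{x}, C_{f(i)})\, p(\mathbf{x} \mid \mathcal{H})\, d\mathbf{x} . \qquad (8.7)$$

The factor $\Pr\{\Psi \mid \mathcal{H}\}$ is the prior probability of
$\Psi$, and since this is independent of hypotheses on data from
previous time steps:

$$\Pr\{\Psi \mid \mathcal{H}\} = \Pr\{\Psi\} .$$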


The evaluation of this probability depends on what prior information is
available on the class membership of the measurements $Z$. However it
is known that the probability of receiving $m'_j$ measurements belonging
to class $C_j$ is $g_j(m'_j)$. Thus the joint prior probability that
$m'_j$ measurements belong to class $C_j$ for $j = 1, \ldots, N_c$ is:

$$\prod_{j=1}^{N_c} g_j(m'_j) , \quad \text{where} \quad \sum_{j=1}^{N_c} m'_j = m .$$

If there are no prior restrictions on the class membership of
measurements, then the number of hypotheses $\Psi$ that could have
caused this distribution of measurements is:

$$\frac{m!}{\prod_{j=1}^{N_c} m'_j!} .$$

So, since a priori each of these hypotheses is equally probable:

$$\Pr\{\Psi\} = \frac{\prod_{j=1}^{N_c} m'_j!}{m!} \prod_{j=1}^{N_c} g_j(m'_j) , \qquad (8.8)$$

where $m'_j$ is the number of measurements assigned to class $C_j$ under
hypothesis $\Psi$. If the class membership of the measurements is
restricted, there may be fewer hypotheses corresponding to this
distribution of measurements, in which case equation (8.8) must be
amended. The final factor $\Pr\{\mathcal{H}\}$ is available from the
previous recursion, and the denominator of equation (8.6) is given by:

Thus again all the required functions are available and in principle
the  required  solution  may  be  obtained  by  substituting  into  equation  (8.6). 
The  prediction  forwards  to  obtain  the  prior  pdf  at  the  following  time 
step  follows  from  the  propagation  equation  of  the  state  vector,  as 
indicated  in  section  2.3.3. 
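
The prior probability of an association hypothesis in equation (8.8) depends only on how many measurements the hypothesis assigns to each class. The sketch below evaluates it for arbitrary count distributions. It assumes no prior restrictions on class membership; the function names, the detection probability of 0.9 and the Poisson mean of 2 are illustrative assumptions, not values from the report.

```python
from math import exp, factorial


def hypothesis_prior(counts, count_pmfs):
    """Prior probability of one hypothesis with the given per-class counts.

    counts[j] is the number of measurements assigned to class j and
    count_pmfs[j](k) is the probability of receiving k class-j measurements.
    The joint probability of the counts is shared equally among the
    m! / prod(counts[j]!) hypotheses that produce those counts.
    """
    m = sum(counts)
    prob = 1.0 / factorial(m)
    for count, pmf in zip(counts, count_pmfs):
        prob *= factorial(count) * pmf(count)
    return prob


def poisson_pmf(mean):
    """Count distribution for a class with Poisson-distributed numbers."""
    return lambda k: exp(-mean) * mean ** k / factorial(k)


def detection_pmf(p_detect):
    """Count distribution for a class with at most one member."""
    return lambda k: {0: 1.0 - p_detect, 1: p_detect}.get(k, 0.0)


# Two-class example with m = 4 measurements.
g1 = detection_pmf(0.9)
g2 = poisson_pmf(2.0)
prior_one_true = hypothesis_prior([1, 3], [g1, g2])
prior_none_true = hypothesis_prior([0, 4], [g1, g2])
```

With one true measurement among four, the prior is one quarter of $g_1(1) g_2(3)$, since any of the four measurements could be the true one.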

To show that this general solution may be reduced to the baseline
problem, suppose that there are only two classes of measurements, true
and false. At most one of the measurements may be true (class $C_1$) and
the probability distribution of the number of true measurements is given
by:

$$g_1(m'_1) = \begin{cases} 1 - P_D , & m'_1 = 0 \\ P_D , & m'_1 = 1 \\ 0 , & \text{otherwise.} \end{cases} \qquad (8.10)$$

Also from equation (2.2), a true measurement has a Gaussian distribution
about $H\mathbf{x}$:

$$p(\mathbf{z} \mid \mathbf{x}, C_1) = \mathcal{N}(\mathbf{z}; H\mathbf{x}, R) . \qquad (8.11)$$

False measurements (class $C_2$) are uniformly distributed and are
independent of $\mathbf{x}$:

$$p(\mathbf{z} \mid \mathbf{x}, C_2) = V^{-1} , \qquad (8.12)$$

where $V$ is the volume of the sensor surveillance region. The
probability of receiving $m'_2$ false measurements is given by a
Poisson distribution:

$$g_2(m'_2) = e^{-\rho V} \frac{(\rho V)^{m'_2}}{m'_2!} \quad \text{for } m'_2 \ge 0 . \qquad (8.13)$$

As described in section 2.3.2, the hypothesis $\psi_j$, for $j \ne 0$,
indicates that $m'_1 = 1$ and $m'_2 = m - 1$. So from equation (8.8),
for $j \ne 0$:

$$\Pr\{\psi_j\} = \frac{1!\,(m-1)!}{m!}\, g_1(1)\, g_2(m-1) = \frac{1}{m}\, g_1(1)\, g_2(m-1) . \qquad (8.14)$$

If $j = 0$, then $m'_1 = 0$ and $m'_2 = m$, so from equation (8.8):

$$\Pr\{\psi_0\} = \frac{0!\,m!}{m!}\, g_1(0)\, g_2(m) = g_1(0)\, g_2(m) . \qquad (8.15)$$

By substituting equations (8.10) to (8.15) into the general solution
given above, the solution of the baseline problem given in Chapter 2
may be obtained.

8.3  The sector scan problem with intermittent interference

8.3.1  Problem  statement 

In this extension of the sector scan problem (see section 6.2)
interfering measurements may occur behind the target, when viewed from
the sensor position at the origin. If the target position is
$(r, \theta)$, then interfering measurements may occur in the region
(see Fig 8.1):

$$r < \text{range} < r + r_I , \qquad \theta - \theta_I < \text{bearing} < \theta + \theta_I . \qquad (8.16)$$

These measurements are uniformly distributed in polar co-ordinates at a
density of $\rho_I$ km$^{-1}$ radians$^{-1}$; however they only occur
within the surveillance sector. The switching on and off of the
interference is a Markov process. Thus if the interference were present
at time step $k$, the probability that it would be present at time step
$k + 1$ is $p_{11}$, and the probability that it would not be present is
$p_{10} = 1 - p_{11}$. Likewise the probability that there is no
interference at time step $k + 1$ given there is none at time step $k$
is $p_{00}$, while the probability of a transition from off to on is
$p_{01}$.

In this example it is assumed that the parameters $r_I$, $\theta_I$,
$\rho_I$, $p_{11}$ and $p_{00}$ are all known and that interference is
not present as the target enters the surveillance sector. Only one
sensor is present at the origin and no classification information is
available to distinguish between the true measurement, the interfering
measurements and the usual false measurements.

The problem of estimating the state of abruptly changing systems
has received considerable attention (see, for example, Tugnait$^{42}$,
Tugnait and Haddad$^{43}$, Weiss et al$^{45}$ and Blom$^{52}$). As noted in
section 6.4.3, abrupt target manoeuvres have been represented by
allowing the equations of target motion to switch between different
models. In this interference switching problem the target model is
fixed, but the measurement environment may change suddenly according
to the switching probabilities $p_{10}$ and $p_{01}$. It is quite
straightforward to incorporate this possible switching within the usual
Bayesian framework, and this part of the solution (section 8.3.2.1) is
similar to the development in the above references. However the updating
of probabilities and pdfs on the assumption that interference is present
is a new problem. We shall introduce approximations which allow a
practical sub-optimal filter to be derived from the optimal solution.

8.3.2  Problem  solution 


8.3.2.1  Representation of intermittency

For this problem we have the measurement-class association
hypothesis $\Psi$ to consider as described in section 8.2, but in
addition there is the uncertainty of whether or not interference is
present. We introduce a variable $\gamma$ which takes the value 1 if
interference is present and 0 if it is not. This indicator only applies
to the current set of measurements, previous hypotheses on the presence
of interference being included in $\mathcal{H}$. Equation (8.2) which
gives the posterior pdf of $\mathbf{x}$ should be extended to:

where $\mathcal{H}' = (\mathcal{H}, \Psi)$. From Bayes rule, the
corresponding version of equation (8.6) is:

$$\Pr\{\gamma, \mathcal{H}' \mid Z\} = \frac{p(Z \mid \gamma, \mathcal{H}')\, \Pr\{\Psi \mid \gamma, \mathcal{H}\}\, \Pr\{\gamma \mid \mathcal{H}\}\, \Pr\{\mathcal{H}\}}{p(Z)} . \qquad (8.18)$$

In this expression the factor $\Pr\{\gamma \mid \mathcal{H}\}$ is given by
the switching probabilities $p_{ij}$. So for instance, if under
$\mathcal{H}$ interference were present at the previous time step and if
$\gamma = 0$, then:

$$\Pr\{\gamma \mid \mathcal{H}\} = p_{10} .$$

The other factors in expressions (8.17) and (8.18) are given by other
equations in section 8.2.
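
The switching factor can be read directly from the two-state Markov model of section 8.3.1. The sketch below is a minimal illustration; the function name and the numerical values of $p_{11}$ and $p_{00}$ are assumptions made for the example.

```python
def switching_prior(gamma, previous_gamma, p11, p00):
    """Probability of the current interference state given the previous one.

    gamma and previous_gamma are 1 when interference is present and 0
    when it is absent. p11 is the probability of staying on and p00 the
    probability of staying off, so p10 = 1 - p11 and p01 = 1 - p00.
    """
    if previous_gamma == 1:
        return p11 if gamma == 1 else 1.0 - p11
    return 1.0 - p00 if gamma == 1 else p00


# Hypothetical switching probabilities.
p_on_to_off = switching_prior(0, 1, p11=0.8, p00=0.95)
p_off_to_on = switching_prior(1, 0, p11=0.8, p00=0.95)
```

Each history hypothesis carries its own previous interference state, so this factor is evaluated once per pair of history hypothesis and current value of $\gamma$.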

8.3.2.2  The likelihood of a set of measurements: $p(Z \mid \mathbf{x}, \gamma, \Psi)$

A key step in the solution of this problem is the evaluation of the
likelihood $p(Z \mid \mathbf{x}, \gamma, \Psi)$. The hypothesis $\Psi$
assigns each member of $Z$ to one of three classes. We define class 1 to
be true measurements, class 2 to be the interfering measurements and
class 3 to be the usual false measurements. If $\gamma = 0$, none of the
measurements belong to class 2. For classes 1 and 3, we have as usual:

$$p(\mathbf{z} \mid \mathbf{x}, C_1) = \mathcal{N}(\mathbf{z}; h(\mathbf{x}), R)$$

and

$$p(\mathbf{z} \mid \mathbf{x}, C_3) = V^{-1} ,$$

where $V$ is the volume of the surveillance region. For class 2:

$$p(\mathbf{z} \mid \mathbf{x}, C_2) = \big[H(r_m - r) - H(r_m - (r + r_I))\big]\, \big[H(\theta_m - (\theta - \theta_I)) - H(\theta_m - (\theta + \theta_I))\big] / V_I ,$$

where $H(\cdot)$ is the Heaviside function, $V_I = 2\theta_I r_I$ is the
volume of the interference region, $(r, \theta)$ is the target position
in polar co-ordinates, and:

$$\mathbf{z} = \begin{pmatrix} r_m \\ \theta_m \end{pmatrix} .$$

The pdf takes this form because class 2 measurements cannot lie outside
the interference region (which is assumed to be within the surveillance
region for this expression).

Suppose that under hypothesis $\Psi$ for the measurements received at
a particular time step, measurement $t$ belongs to class 1, measurements
with subscripts from a given set belong to class 2 and the $m'_3$
remaining measurements belong to class 3. In this case the likelihood
of the received measurements $Z$ is given by (from equation (8.4)):

$$p(Z \mid \mathbf{x}, \gamma, \Psi) = \mathcal{N}(\mathbf{z}_t; h(\mathbf{x}), R)\, \Big[ \prod_{i \in \text{class 2}} p(\mathbf{z}_i \mid \mathbf{x}, C_2) \Big]\, V^{-m'_3} .$$

If none of the measurements is true, the factor
$\mathcal{N}(\mathbf{z}_t; h(\mathbf{x}), R)$ is omitted. By considering
the factors in the product of the class 2 terms, such as

$$H(r_m - r) - H(r_m - (r + r_I)) ,$$

it can be seen that:


o  o 


(8.19) 


where $r_{MX}$ and $r_{MN}$ are the maximum and minimum range
measurements in class 2, and $\theta_{MX}$ and $\theta_{MN}$ are the
maximum and minimum bearing measurements in class 2. Note that
expression (8.19) is sensible because if two measurements allocated to
class 2 by $\Psi$ are separated in range by more than $r_I$ or in
bearing by more than $2\theta_I$, the hypothesis must be false. If this
is not so, the extreme class 2 measurements restrict the possible target
position under $\Psi$ to the rectangle in $r, \theta$ space shown in
Fig 8.2. This is equivalent to a region $A$ in $x, y$ space, and for
convenience we shall define the function:

$$U_A(\mathbf{x}) = \begin{cases} 0 , & \text{if } m'_2 > 0 \text{ and } \{\, x^2 + y^2 > r_{MN}^2 \ \text{or}\ x^2 + y^2 < (r_{MX} - r_I)^2 \ \text{or}\ \tan^{-1}(y/x) > \theta_{MN} + \theta_I \ \text{or}\ \tan^{-1}(y/x) < \theta_{MX} - \theta_I \,\} \\ 0 , & \text{if } A \text{ does not exist but } m'_2 > 0 \\ 1 , & \text{otherwise (including the case } m'_2 = 0) . \end{cases}$$

Thus the required likelihood of the measurements $Z$ is given by:

$$p(Z \mid \mathbf{x}, \gamma, \Psi) = \begin{cases} V^{-m'_3}\, V_I^{-m'_2}\, U_A(\mathbf{x})\, \mathcal{N}(\mathbf{z}_t; h(\mathbf{x}), R) , & \text{if } m'_1 = 1 \\ V^{-m'_3}\, V_I^{-m'_2}\, U_A(\mathbf{x}) , & \text{if } m'_1 = 0 . \end{cases} \qquad (8.20)$$

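8.3.2.3  First approximation: the prior pdf $p(\mathbf{x} \mid \mathcal{H}')$ is
Gaussian

Having found this likelihood we may proceed with the solution via
equations (8.3) and (8.18). However to arrive at a practical filter
it will be necessary to make a number of simplifying assumptions so
that the resulting filter is sub-optimal. The first approximation is
that the prior pdf $p(\mathbf{x} \mid \mathcal{H}')$ in equation (8.3) is
Gaussian:

$$p(\mathbf{x} \mid \mathcal{H}') = \mathcal{N}(\mathbf{x}; \hat{\mathbf{x}}_i, M_i) ,$$

where $i$ refers to the hypothesis $\mathcal{H}$. As will be seen from
equation (8.22) below, this is incorrect, but it allows us to write:

$$p(\mathbf{z}_t \mid \mathbf{x}, \mathcal{H}')\, p(\mathbf{x} \mid \mathcal{H}') = \mathcal{N}(\mathbf{z}_t; h(\mathbf{x}), R)\, \mathcal{N}(\mathbf{x}; \hat{\mathbf{x}}_i, M_i) \approx \mathcal{N}(\mathbf{x}; \hat{\mathbf{x}}_{it}, P_{it})\, \mathcal{N}(\mathbf{z}_t; h(\hat{\mathbf{x}}_i), S_i) , \qquad (8.21)$$

where we have made use of the extended Kalman filter approximation
(see section 6.2). Thus, using the above result with (8.20)
in equation (8.3), we obtain:

$$p(\mathbf{x} \mid \gamma, \mathcal{H}', Z) = \begin{cases} U_A(\mathbf{x})\, \mathcal{N}(\mathbf{x}; \hat{\mathbf{x}}_{it}, P_{it}) / F_1 , & \text{if } m'_1 = 1 \\ U_A(\mathbf{x})\, \mathcal{N}(\mathbf{x}; \hat{\mathbf{x}}_i, M_i) / F_2 , & \text{if } m'_1 = 0 , \end{cases} \qquad (8.22)$$

where $F_1 = \int U_A(\mathbf{x})\, \mathcal{N}(\mathbf{x}; \hat{\mathbf{x}}_{it}, P_{it})\, d\mathbf{x}$
and $F_2$ is similar. The function $U_A(\mathbf{x})$ effectively
truncates the Gaussian in (8.22), so that the uncertainty in the value
of $\mathbf{x}$ is reduced by the information from the class 2
measurements. Unfortunately the integrals $F_1$ and $F_2$ of the
Gaussian over the region $A$ cannot be evaluated analytically. However
if $(\hat{x}_{it}, \hat{y}_{it})$ were well inside $A$ and the
corresponding standard deviations from $P_{it}$ were small compared with
the dimensions of $A$, the effect of $U_A(\mathbf{x})$ in equation (8.22)
could be ignored.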
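
Although the integral of the Gaussian over $A$ has no closed form, it can be estimated numerically. The sketch below uses simple Monte Carlo sampling purely as an illustration; it is not the approach taken in this report, which introduces approximations instead. It assumes that only the position part of the state matters (so a two-dimensional Gaussian is sampled) and that $A$ is a rectangle in range and bearing. All names and numbers are hypothetical.

```python
import math
import random


def truncated_mass(mean, cov, r_lo, r_hi, th_lo, th_hi, n=20000, seed=0):
    """Monte Carlo estimate of the Gaussian probability mass inside a
    range-bearing rectangle.

    mean is (x, y), cov is ((sxx, sxy), (sxy, syy)). Samples are drawn
    using the Cholesky factor of the 2 x 2 covariance.
    """
    rng = random.Random(seed)
    (sxx, sxy), (_, syy) = cov
    l11 = math.sqrt(sxx)
    l21 = sxy / l11
    l22 = math.sqrt(syy - l21 * l21)
    hits = 0
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        v = rng.gauss(0.0, 1.0)
        x = mean[0] + l11 * u
        y = mean[1] + l21 * u + l22 * v
        r = math.hypot(x, y)
        theta = math.atan2(y, x)
        if r_lo < r < r_hi and th_lo < theta < th_hi:
            hits += 1
    return hits / n


cov = ((0.01, 0.0), (0.0, 0.01))
# Mean well inside the region: almost no truncation.
mass_inside = truncated_mass((10.0, 0.0), cov, 5.0, 15.0, -0.5, 0.5)
# Mean on the inner range boundary: roughly half the mass is removed.
mass_edge = truncated_mass((10.0, 0.0), cov, 10.0, 15.0, -0.5, 0.5)
```

The first case corresponds to the situation noted above in which the truncation can be ignored; the second shows how strongly the factor can differ from unity when a component mean lies near the edge of $A$.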


Now consider the posterior probability of hypothesis $(\gamma, \mathcal{H}')$,
given by equation (8.18). The probability of receiving $m_2'$ interfering
class 2 measurements is given by a Poisson distribution with mean
$\rho_I V_I$, where $V_I = 2\theta_I r_I$ is the volume of the interference
region. Similarly the probability of receiving $m_3'$ false measurements is
given by a Poisson distribution with mean $\rho V$, where V is the volume of
the surveillance region. Thus from equation (8.8):
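Also using (8.20) and (8.21) with equation (8.7), we have:

$$
p(Z \mid \gamma, \mathcal{H}') =
\begin{cases}
V^{-m_3'}\, V_I^{-m_2'}\, \mathcal{N}(z_t; h(\bar{x}_i), S_i)
  \int U_A(x)\, \mathcal{N}(x; \hat{x}_{it}, P_{it})\, dx & \text{if } m_1' = 1 \\
V^{-m_3'}\, V_I^{-m_2'}
  \int U_A(x)\, \mathcal{N}(x; \bar{x}_i, M_i)\, dx & \text{if } m_1' = 0 .
\end{cases}
\tag{8.24}
$$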


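Inserting (8.23) and (8.24) into (8.18) we obtain:

$$
\Pr\{\gamma, \mathcal{H}' \mid Z\} =
\begin{cases}
P_D\, \mathcal{N}(z_t; h(\bar{x}_i), S_i)\, e^{-\rho V} \rho^{m_3'}\,
  \Pr\{\gamma \mid \mathcal{H}'\}\, \Pr\{\mathcal{H}'\}\, G_1 / E
  & \text{if } m_1' = 1 \\
(1 - P_D)\, e^{-\rho V} \rho^{m_3'}\,
  \Pr\{\gamma \mid \mathcal{H}'\}\, \Pr\{\mathcal{H}'\}\, G_2 / E
  & \text{if } m_1' = 0 ,
\end{cases}
\tag{8.25}
$$

with

$$
G_1 =
\begin{cases}
e^{-\rho_I V_I} \rho_I^{m_2'} \int U_A(x)\, \mathcal{N}(x; \hat{x}_{it}, P_{it})\, dx & \text{if } \gamma = 1 \\
1 & \text{if } \gamma = 0 ,
\end{cases}
\qquad
G_2 =
\begin{cases}
e^{-\rho_I V_I} \rho_I^{m_2'} \int U_A(x)\, \mathcal{N}(x; \bar{x}_i, M_i)\, dx & \text{if } \gamma = 1 \\
1 & \text{if } \gamma = 0 ,
\end{cases}
$$

where E is the normalizing denominator which is chosen so that:

$$\sum_{\gamma=0}^{1} \sum_{\text{All } \mathcal{H}'} \Pr\{\gamma, \mathcal{H}' \mid Z\} = 1 .$$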


Thus in principle, the posterior pdf of x may be obtained by
substituting (8.22) and (8.25) into (8.17) and summing over all
feasible hypotheses. The main difficulties here are that an integral
of the form:

$$\int U_A(x)\, \mathcal{N}(x; \hat{x}_{it}, P_{it})\, dx$$

must be evaluated for every hypothesis with $\gamma = 1$ and that there are
a very large number of feasible hypotheses. In fact, if m measurements
are received and if $\gamma = 1$, the number of feasible hypotheses
concerning the class membership of measurements is $(2 + m)\,2^{m-1}$. This
figure is very large even for modest values of m: for m = 20 there
are over $10^7$ measurement association hypotheses. So, to derive a
practical filter further simplifying approximations must be introduced.
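The hypothesis count quoted above can be checked by brute force. The following is a minimal sketch, assuming only that each of the m measurements is assigned to class 1 (true), class 2 (interfering) or class 3 (false) and that at most one measurement may be the true one; the function names are illustrative and do not come from the original study.

```python
from itertools import product


def count_hypotheses(m: int) -> int:
    """Count class assignments of m measurements with at most one in class 1.

    Classes: 1 = true target measurement, 2 = interfering, 3 = false.
    Brute-force enumeration, so only sensible for small m.
    """
    return sum(
        1
        for labels in product((1, 2, 3), repeat=m)
        if labels.count(1) <= 1
    )


def closed_form(m: int) -> int:
    """Closed-form count (2 + m) * 2**(m - 1) quoted in the text."""
    return (2 + m) * 2 ** (m - 1)


# The enumeration and the closed form agree for small m.
for m in range(1, 9):
    assert count_hypotheses(m) == closed_form(m)

# For m = 20 the closed form exceeds 10**7, as stated in the text.
assert closed_form(20) > 10**7
```

The closed form follows from adding the $2^m$ assignments with no true measurement to the $m\,2^{m-1}$ assignments with exactly one.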
8.3.2.4 Further approximations to derive a practical filter

Firstly we shall ignore the contribution of the class 2 measurements
in the expression for $p(x \mid \gamma, \mathcal{H}', Z)$. Thus equation
(8.22) becomes:

$$
p(x \mid \gamma, \mathcal{H}', Z) =
\begin{cases}
\mathcal{N}(x; \hat{x}_{it}, P_{it}) & \text{if } m_1' = 1 \\
\mathcal{N}(x; \bar{x}_i, M_i) & \text{if } m_1' = 0 .
\end{cases}
\tag{8.26}
$$

(This is the same sort of approximation as made in the derivation of the
Auxiliary Sensor filter (see section 7.4.2).) Thus information from the
class 2 measurements is only taken into account via the probabilities
$\Pr\{\gamma, \mathcal{H}' \mid Z\}$. Clearly some potentially valuable
information is being discarded here; however, it does allow a useful
simplification of equation (8.17) (and it ensures that the prior pdf
$p(x \mid \mathcal{H}')$ is Gaussian). Let us write the measurement
association hypothesis $\Psi$ as the pair:

$$\Psi = (\Omega, \Lambda) ,$$

where $\Omega$ indicates the choice of true measurement (class 1) and
$\Lambda$ specifies the partition of the remaining measurements between
classes 2 and 3. From (8.26) it can be seen that
$p(x \mid \gamma, \mathcal{H}', Z)$ is independent of $\Lambda$, so that
equation (8.17) may be written:
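$$
p(x \mid Z) = \sum_{\gamma=0}^{1} \sum_{\text{All } \mathcal{H}'} \sum_{\text{All } \Omega}
  p(x \mid \gamma, \Omega, \mathcal{H}', Z)\, \Pr\{\gamma, \Omega, \mathcal{H}' \mid Z\} ,
\tag{8.27}
$$

where $p(x \mid \gamma, \Omega, \mathcal{H}', Z)$ is given by equation (8.26)
and

$$
\Pr\{\gamma, \Omega, \mathcal{H}' \mid Z\} =
  \sum_{\text{All } \Lambda} \Pr\{\gamma, \Omega, \Lambda, \mathcal{H}' \mid Z\} .
\tag{8.28}
$$

The usual acceptance test may be employed to make a short list of $\Omega$
hypotheses for each hypothesis $\mathcal{H}'$.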

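We now introduce the last approximation which allows us to perform
the summation over all $\Lambda$ in equation (8.28). It is assumed, only for
the integrals in equation (8.25), that:

$$
\mathcal{N}(x; \hat{x}_{it}, P_{it}) \rightarrow \delta(x - \hat{x}_{it})
\quad \text{and} \quad
\mathcal{N}(x; \bar{x}_i, M_i) \rightarrow \delta(x - \bar{x}_i) .
\tag{8.29}
$$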


Thus the uncertainty in the value of x represented by $P_{it}$ or $M_i$
under the hypothesis $(\Omega, \mathcal{H}')$ is ignored. It is recognized
that this contradicts the first assumption given by equation (8.26), and
although this is unsatisfactory, it does enable a practical filter to be
derived.

Consider the probability $\Pr\{\gamma, \Omega, \Lambda, \mathcal{H}' \mid Z\}$
given by equation (8.25) in the light of assumption (8.29). With this
assumption, for $\gamma = 1$, the interference region is known precisely
under hypothesis $(\Omega, \mathcal{H}')$. Thus if all of the measurements
associated with class 2 by hypothesis $\Lambda$ are in this region, then the
integral in (8.25) is unity. Otherwise the integral is zero, so that the
probability of this hypothesis is zero. So, from (8.25):


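$$
\Pr\{\gamma, \Omega, \Lambda, \mathcal{H}' \mid Z\} =
\begin{cases}
e^{-\rho V} \rho^{m_3'}\, e^{-\rho_I V_I} \rho_I^{m_2'}\,
  \Pr\{\gamma \mid \mathcal{H}'\}\, \Pr\{\mathcal{H}'\}\, L / E
  & \text{if } \gamma = 1 \text{ and if, under } \Lambda, \text{ all of the class 2} \\
  & \text{measurements are within the interference} \\
  & \text{region defined by } \Omega , \\
e^{-\rho V} \rho^{m_3'}\,
  \Pr\{\gamma \mid \mathcal{H}'\}\, \Pr\{\mathcal{H}'\}\, L / E
  & \text{if } \gamma = 0 , \\
0 & \text{otherwise,}
\end{cases}
\tag{8.30}
$$

where

$$
L =
\begin{cases}
P_D\, \mathcal{N}(z_t; h(\bar{x}_i), S_i) & \text{if } m_1' = 1 \\
1 - P_D & \text{if } m_1' = 0 .
\end{cases}
$$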
Now if $\gamma = 0$, then all measurements apart from the true
measurement belong to class 3, ie if $\gamma = 0$, for each $\Omega$ there is
only one hypothesis $\Lambda$. So if $\gamma = 0$, there is only one term in
the summation of equation (8.28) and
$\Pr\{\gamma, \Omega, \mathcal{H}' \mid Z\}$ is given directly by
equation (8.30). However if $\gamma = 1$, then for each $\Omega$ the number
of feasible hypotheses $\Lambda$ is equal to the number of ways of
partitioning the measurements in the interference region between classes 2
and 3. Suppose that with $\gamma = 1$, under hypothesis
$(\Omega, \mathcal{H}')$, $m_r$ measurements fall within the interference
region. If $m_2'$ of these measurements belong to class 2, then there are
exactly $\binom{m_r}{m_2'}$ ways of partitioning the $m_r$ measurements
between classes 2 and 3. Since the measurements outside the interference
region all belong to class 3 (excluding the true measurement), there are
exactly $\binom{m_r}{m_2'}$ feasible hypotheses $\Lambda$ for which $m_2'$
measurements belong to class 2. The probability of each of these
hypotheses is the same and it is given by (8.30). Also since $m_2'$ may
take any value between 0 and $m_r$, for $\gamma = 1$ the summation (8.28)
may be evaluated using equation (8.30) and the identity:

$$\sum_{j=0}^{n} \binom{n}{j} a^j b^{n-j} = (a + b)^n .$$

Thus the probability $\Pr\{\gamma, \Omega, \mathcal{H}' \mid Z\}$ may be
written (absorbing some common factors into the normalizing denominator):
where the normalizing denominator E is chosen so that:

$$
\sum_{\gamma=0}^{1} \sum_{\text{All } \Omega} \sum_{\text{All } \mathcal{H}'}
  \Pr\{\gamma, \Omega, \mathcal{H}' \mid Z\} = 1 .
$$

By applying these approximations, the number of hypotheses that
must be explicitly considered has been reduced to a feasible number,
provided that the usual acceptance test and mixture reduction algorithm
are employed. The posterior probabilities for the feasible hypotheses
are simple modifications to those of the baseline problem given by
equation (2.18). To evaluate the main modifying factor for $\gamma = 1$,
it is only necessary to count the number $m_r$ of measurements falling
within the interference region defined by the hypothesis $\Omega$; the
awkward integrals of expression (8.25) are avoided. Due to the
incorrect assumption (8.29) that the state vector is perfectly known
under $(\Omega, \mathcal{H}')$, this modifying factor may be overselective,
so occasionally an undue weighting is given to the wrong component. To
compensate for this, in the evaluation of $m_r$ a heuristic adjustment
has been made to the boundaries of the interference region as defined
by $\Omega$. Each azimuth boundary has been increased by one standard
deviation of the true measurement bearing error $\sigma_\theta$, and each
range boundary has been increased by $\sigma_r$ (see Fig 8.3). This has the
effect of 'softening' the selectivity of the modifying factor
$(1 + \rho_I/\rho)^{m_r}$ in (8.31). Further details of the filter
implementation are described in the following section.
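The modifying factor $(1 + \rho_I/\rho)^{m_r}$ arises from summing the equally weighted partitions of the $m_r$ measurements inside the interference region between classes 2 and 3. The sketch below checks that summation numerically. It assumes, as an illustration only, that each partition with $m_2'$ class 2 measurements carries a relative weight $(\rho_I/\rho)^{m_2'}$ once the common factors are absorbed into the normalizing denominator; the function names are hypothetical.

```python
from math import comb, isclose


def summed_partition_weight(m_r: int, rho: float, rho_i: float) -> float:
    """Sum the relative weights of all class 2 / class 3 partitions.

    There are comb(m_r, m2) partitions with m2 class 2 measurements, each
    with relative weight (rho_i / rho) ** m2 (assumed normalization).
    """
    ratio = rho_i / rho
    return sum(comb(m_r, m2) * ratio**m2 for m2 in range(m_r + 1))


def modifying_factor(m_r: int, rho: float, rho_i: float) -> float:
    """Closed form (1 + rho_i / rho) ** m_r quoted in the text."""
    return (1.0 + rho_i / rho) ** m_r


# Binomial theorem: the explicit sum and the closed form agree.
for m_r in range(0, 12):
    assert isclose(
        summed_partition_weight(m_r, rho=10.0, rho_i=40.0),
        modifying_factor(m_r, rho=10.0, rho_i=40.0),
    )
```

Because the factor grows geometrically with $m_r$, a single measurement moving across the region boundary changes the weight by a factor of $(1 + \rho_I/\rho)$, which is why the boundary softening described above matters at high interference densities.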

8.3.3  Implementation  of  the  filter 

The implementation of the tracking filter derived in the previous
sections is based on equations (8.26), (8.27) and (8.31). The
formation and control of hypotheses is shown schematically in Fig 8.4.
Each hypothesis $\mathcal{H}'$ from the previous time step is predicted
forwards and the usual acceptance test is applied to identify a set of
probable true measurements for each $\mathcal{H}'$. Together with the
possibility that the true measurement has been missed, these sets make up
the $\Omega$ hypotheses. For each $(\Omega, \mathcal{H}')$ hypothesis, the
posterior pdf of x is evaluated from equation (8.26) (from our
approximation this is independent of $\gamma$). Each
$(\Omega, \mathcal{H}')$ hypothesis is then split to allow for the
possibilities of interference absent or present ($\gamma = 0$ or 1),
and the posterior probability of each $(\gamma, \Omega, \mathcal{H}')$
hypothesis is calculated from equation (8.31).


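The mixture components and probability weights of the posterior
pdf of x for the current time step are now available (see equation
(8.27)). The required estimate may now be extracted, the usual
minimum mean square estimate being given by:

$$
\hat{x} = \sum_{\text{All } \mathcal{H}'} \sum_{\text{All } \Omega}
  \left( \sum_{\gamma=0}^{1} \Pr\{\gamma, \Omega, \mathcal{H}' \mid Z\} \right)
  \times
  \begin{cases}
  \hat{x}_{it} & \text{if } m_1' = 1 \\
  \bar{x}_i & \text{if } m_1' = 0 ,
  \end{cases}
\tag{8.32}
$$

where $\hat{x}_{it}$ and $\bar{x}_i$ are the means of the mixture components
(see equation (8.26)), subscript i corresponds to $\mathcal{H}'$, and t is
the choice of true measurement defined by $\Omega$. Also the probability
that interference is present, based on the filter's processing of the
received measurements, is given by:

$$
P_I = \sum_{\text{All } \mathcal{H}'} \sum_{\text{All } \Omega}
  \Pr\{\gamma = 1, \Omega, \mathcal{H}' \mid Z\} .
\tag{8.33}
$$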


For implementation using the Clustering Algorithm, before mixture
reduction the hypotheses are divided into two groups for $\gamma = 0$ and
$\gamma = 1$. The Clustering Algorithm is then applied separately to the
mixture distribution corresponding to each group. This ensures that
even after reduction, each mixture component is associated with
$\gamma = 0$ or $\gamma = 1$. The reduced mixture can then be predicted
forwards in the usual way, ready for the next set of measurements.
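The extraction of the estimate and of the interference probability from the weighted components can be sketched as follows. This is an illustrative sketch only: the tuple layout `(gamma, weight, mean)` and the function name are assumptions made here, and the weights stand in for the posterior hypothesis probabilities of (8.31), which are not recomputed.

```python
def summarize_mixture(components):
    """Return (mmse_mean, prob_interference) for a labelled mixture.

    components: list of (gamma, weight, mean) tuples, where gamma is 0 or 1,
    weight is the (possibly unnormalized) hypothesis probability and mean is
    a list of floats of common length.
    """
    total = sum(weight for _, weight, _ in components)
    dim = len(components[0][2])
    # Minimum mean square estimate: probability-weighted sum of the means.
    mmse_mean = [
        sum(weight * mean[j] for _, weight, mean in components) / total
        for j in range(dim)
    ]
    # Probability that interference is present: total weight with gamma = 1.
    prob_interference = (
        sum(weight for gamma, weight, _ in components if gamma == 1) / total
    )
    return mmse_mean, prob_interference


example = [
    (0, 0.2, [1.0, 0.0]),
    (1, 0.5, [2.0, 1.0]),
    (1, 0.3, [3.0, 1.0]),
]
estimate, p_interference = summarize_mixture(example)
```

Keeping the $\gamma$ label on every component, as the separate clustering of the two groups does, is what allows the interference probability to be read off as a simple sum of weights.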

An implementation using a PDAF type of approximation is shown
schematically in Fig 8.5. This is slightly different from the usual
PDAF philosophy in that two mixture components are allowed to survive
at each time step. These two components correspond to interference
present or interference absent. When a set of measurements is received,
each of these components is split according to $\gamma = 1, 0$, and the
PDAF is applied to each branch. Thus four branches are created with
probability weights $\beta_{00}$, $\beta_{01}$, $\beta_{10}$ and
$\beta_{11}$ (see Fig 8.5). The two $\gamma = 0$ branches and the two
$\gamma = 1$ branches are then merged separately to form a two component
mixture distribution with probability weights:

$$\beta_0 = \beta_{00} + \beta_{10}$$

and

$$\beta_1 = \beta_{01} + \beta_{11} .$$

These components are predicted forwards to the next time step.

Note that the standard PDAF avoids the calculation of the mean
of each mixture component before reduction. However for this problem,
as for the Auxiliary Sensor filter (section 7.4.2), to evaluate the
required probability weights, the means must be available. They are
required to identify the interference region so that $m_r$ can be found
(see (8.31)). Thus much of the efficiency of the standard
PDAF is lost in this implementation.
8.3.4 Simulation example

Target trajectories and measurements have been simulated for the
problem defined in section 8.3.1. The standard sector scan parameters
shown in Table 6.1 have been used with the following interference
parameters:

    Interference region:       $r_I$ = 5 km,  $\theta_I$ = 0.04 radians

    Switching probabilities:   $p_{00}$ = 0.9,  $p_{10}$ = 0.1
                               $p_{11}$ = 0.9,  $p_{01}$ = 0.1

One hundred replications of trajectories and measurements have been
generated for each of the following values of interfering measurement
density $\rho_I$: 10, 20, 40, 100, 200 and 400 km$^{-1}$ rad$^{-1}$. (Note
that the density of the usual false measurements is
$\rho$ = 10 km$^{-1}$ rad$^{-1}$.) The standard sector scan filter, which
assumes there is no interference, and the Interference filter described in
the previous section have both been applied to the simulated data. In each
case results have been obtained for both PDAF and Clustering Algorithm
reduction techniques. The percentage of maintained tracks for each of
these filters is shown in Fig 8.6 as a function of $\rho_I$.

The introduction of intermittent interference with
$\rho_I$ = 10 km$^{-1}$ rad$^{-1}$ has negligible effect on the performance
of the standard CAF and PDAF. Also the performance of the Interference CAF
is very similar to that of the standard CAF, and likewise the performance
of both PDAFs is similar. (For this low level of $\rho_I$, with
interference switched on, the average number of interfering measurements
generated per scan is only four.) With increasing $\rho_I$, the percentage
of maintained tracks for the standard filters tends to decrease, as would
be expected.
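The figure of four interfering measurements per scan can be reproduced from the simulation parameters. The sketch below assumes that the interference region has volume $2\theta_I r_I$ in range-bearing measurement space (an assumption consistent with the quoted average) and that the number of interfering measurements is Poisson distributed with mean $\rho_I V_I$; the function names are illustrative.

```python
from math import exp, factorial, isclose

R_I_KM = 5.0          # range extent of the interference region (km)
THETA_I_RAD = 0.04    # half-width of the interference region in bearing (rad)


def interference_volume(r_i: float, theta_i: float) -> float:
    """Volume of the interference region, assumed to be 2 * theta_i * r_i."""
    return 2.0 * theta_i * r_i


def mean_interfering(rho_i: float) -> float:
    """Expected number of interfering measurements per scan."""
    return rho_i * interference_volume(R_I_KM, THETA_I_RAD)


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k events for a Poisson distribution."""
    return exp(-mean) * mean**k / factorial(k)


# Expected counts for the simulated interference densities (km^-1 rad^-1).
expected = {rho_i: mean_interfering(rho_i) for rho_i in (10, 20, 40, 100, 200, 400)}
assert isclose(expected[10], 4.0)
assert isclose(expected[400], 160.0)
```

At the highest density the expected count is forty times that of the sparsest case, which is consistent with the much sharper detection of interference reported for that density.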

However the performance of the Interference PDAF improves with $\rho_I$
and tends towards the performance of the Interference CAF. This
improvement is because the Interference PDAF is making use of
information from the interfering measurements. As $\rho_I$ increases, more
measurements fill out the interference region, so that the boundaries
of the region become more well defined (see Fig 8.2). Thus the
probability weight for the correct $\Omega$ hypothesis is more strongly
reinforced as $\rho_I$ increases. The percentage of tracks held by the
Interference CAF remains roughly constant at about 95% as $\rho_I$
increases; by modelling the intermittent interference, the performance
degradation of the standard CAF is avoided.

As expected, the average number of mixture components generated
increases with $\rho_I$ for both standard and Interference filters (see
Fig 8.6). For $\rho_I \le 40$ km$^{-1}$ rad$^{-1}$, the standard filters
generate fewer components than the Interference filters, while for
$\rho_I \ge 100$ km$^{-1}$ rad$^{-1}$ the standard filters generate more
components. This may be explained as follows. When $\rho_I$ is small, the
standard filters are only likely to encounter a few interfering
measurements, especially if track is maintained. If the Interference
filters encounter a similar measurement density, they will generate more
components since allowance is made for the possibilities $\gamma = 0$ and
$\gamma = 1$. When $\rho_I$ is large, even if the standard filters maintain
track, they are likely to have been attracted into an interference region
(and so encountered a high density of measurements) during their
traversal of the sector. However the Interference filters recognize that
interference may be present, and make use of their knowledge of the
distribution of these measurements relative to the target position, to
lead the track beside the interference region. Thus these filters avoid
regions of high measurement density and so for large $\rho_I$, on average
they generate fewer components than the standard filters, even though the
possibilities $\gamma = 0$ and $\gamma = 1$ are included.

The variation with $\rho_I$ in the number of components generated by
the CAF is reflected in the average cpu time to perform a single
iteration (see Table 8.1). However the processing time for the standard
PDAF is always less than that for the Interference PDAF, which requires
explicit calculation of the mean of each mixture component (see section
8.3.3). The error statistic E is also given in Table 8.1. This shows that
even for held tracks, both the standard and the Interference filters
tend to underestimate their tracking error, particularly for large
values of $\rho_I$. This is probably due to the rather sweeping
approximations made in the derivation of the Interference filters and
the omission of any interference model for the standard filters. The
error underestimate is worst for the standard PDAF. As for the
standard sector scan problem, for lost tracks the filters often
seriously underestimate the tracking error.

Fig 8.7 shows the filters' achieved mean square position error
over the first forty time steps for $\rho_I$ = 40 and 400 km$^{-1}$
rad$^{-1}$. For maintained tracks, the accuracy of the Interference filters
is superior to the standard filters. The improvement is most evident for
$\rho_I$ = 400 km$^{-1}$ rad$^{-1}$. For the standard filter, tracking error
for the held tracks increases with $\rho_I$, while for the Interference
filters the CAF is little changed and the PDAF shows some improvement.

Finally, Figs 8.8 to 8.10 show three examples of target tracking
for $\rho_I$ = 10, 40 and 400 km$^{-1}$ rad$^{-1}$. Each of these figures
shows trajectory estimates produced by the CAF and the PDAF for both the
standard filter and the Interference filter. The points at which
interference switches on and off are indicated on the actual target
paths. Also for each example a plot of interference switching against
time is presented. This may be compared with the Interference filters'
internal assessment of the probability $P_I$ that interference is
present (obtained by summing over the appropriate mixture weights,
see equation (8.33)). For $\rho_I$ = 400 km$^{-1}$ rad$^{-1}$, the plots of
$P_I$ are essentially identical to the actual switching waveform, showing
that the filters are very certain as to the presence or absence of
interference. With this high density it is easy for the filters to detect
the large number of extra measurements behind the target when
interference is present. (A sample plot of the measurements received on a
single scan for $\rho_I$ = 400 km$^{-1}$ rad$^{-1}$ is shown in Fig 8.11.)
As $\rho_I$ is reduced the presence or absence of interference becomes more
difficult to detect, and for the sparse interference
$\rho_I$ = 10 km$^{-1}$ rad$^{-1}$ the traces of $P_I$ are quite different
from the actual switching signal (see Fig 8.8). At the higher densities of
$\rho_I$ = 40 and 400 km$^{-1}$ rad$^{-1}$, the effect on the standard filter
tracks of interference appearing behind the target is obvious, and the
value of modelling the interference is clearly demonstrated.

8.4  Conclusions 

It is fairly straightforward to derive the formal Bayesian
solution to the extension of the baseline problem to multiple
measurement classes. However the interference example shows that very
complex filters may result when this general solution is applied to
specific problems. By making several approximations a practical
filter has been derived for the interference problem. In spite of
these approximations, simulations show the performance benefit of
modelling the intermittent interference. Especially for high levels
of interference, the performance of the multiple measurement class
filter is clearly superior to the standard filter which takes no
account of possible interference.


Fig 8.1 The interference region

Fig 8.4 Implementation of the interference filter using the Clustering Algorithm

Fig 8.11 Sample plot of all measurements received on a single scan

9 CONCLUSIONS AND FURTHER WORK


In this thesis we have shown how Bayesian techniques may be
applied to tracking problems where the origin of the measurement is
uncertain. A mixture reduction technique has been developed to contain
the ever growing computational requirements of the optimal
Bayesian filter. The performance of this Clustering Algorithm has
been assessed by simulation for a straightforward baseline tracking
problem, and it has been compared with the PDAF method. Filters have
also been developed for extensions of the baseline case including data
fusion and measurement interference problems.

The detailed conclusions and discussions for this study are given
at the end of each chapter. Some overall observations are given below:

(i) The performance of the CAF is always better than or
similar to that of the PDAF. This improvement is at the expense
of increased computational memory and processing requirements.
The processing time for the CAF is usually within an order of
magnitude of the PDAF processing time, although for very difficult
cases, where performance is in any case poor, the excess may be
several orders of magnitude.

(ii)  Bayes  theorem  provides  a  convenient  recursive  mechanism 
for  incorporating  information  from  various  sources,  and  for  many 
interesting  tracking  problems  a  filter  based  on  the  optimal 
solution  may  be  derived.  However,  even  for  minor  extensions  of 
the  baseline  problem,  the  optimal  filter  may  be  very  complex  so 
that  a  number  of  significant  approximations  must  be  imposed  to 
obtain  a  practical  filter. 



(iii)  Simulation  has  proved  to  be  a  useful  tool,  both  for 
performance  assessment  and  as  an  aid  to  understanding  the 
operation  of  the  filters. 

This  study  has  been  concerned  with  estimating  the  current  state 
of  a  single  target  based  on  past  measurements.  We  propose  to  extend 
this  work  to  include  trajectory  estimation  and  multiple  target  tracking. 

For  some  applications  it  is  necessary  to  estimate  the  past 
trajectory  of  a  target  as  well  as  its  current  position.  Each  new 
measurement  that  is  received  provides  information  on  the  past  values  of 
the  state  vector  via  the  target  model,  and  clearly  this  information 
should  be  used  for  trajectory  estimation.  A  filter  which  refines  past 
estimates  in  the  light  of  subsequent  measurements  is  called  a  smoothing 
filter.  In  terms  of  the  pdf  of  target  state,  for  a  smoothing  filter 
we  require: 

$$p(x_k \mid Z_1, \ldots, Z_n) ,$$

where $k \le n$. For standard filtering problems without measurement
uncertainty, efficient optimal smoothing algorithms have been derived
(see Jazwinski, Ref 27). For trajectory estimation, it has been shown that
these filters can provide an impressive improvement over the standard
Kalman filter (see Refs 53 and 54). The smoothing problem for uncertain
measurement association is more complex. Mahalanabis and Zhou have
suggested smoothing back one or two time steps to improve a PDAF
estimate. Also we have obtained some encouraging preliminary results
for full trajectory estimation using a PDAF based smoothing algorithm.
We hope to extend this study to investigate the merits of retaining
more than one component for the smoothing operation.
The problem of tracking multiple targets is more complicated than
the single target case. This is due to the range of extra measurement
association hypotheses that must be taken into account. The coarse
acceptance test is most valuable here in eliminating improbable
associations between measurements and remote tracks. Blackman
presents a branching algorithm for generating the appropriate hypotheses
which is based on techniques developed by Reid (Ref 23) and Mori et al
(Ref 26). As for the single target case, the number of feasible hypotheses
grows rapidly, and we intend to investigate the application of the
Clustering Algorithm to control this growth. Also we propose to study the
multiple target data fusion problem.
Appendix A

THE KALMAN FILTER RELATIONS

A.1 The Kalman filter problem

The Kalman filter problem is similar to the problem statement of
section 2.2 of the main text, except that only a single true measurement
is available at each time step. A simple form of the Kalman
filter problem is stated below.

The state vector x is assumed to obey a linear system model:

$$x_{k+1} = \Phi x_k + \Gamma w_k , \tag{A-1}$$

where $x_k$ is the n-dimensional state vector at time $t_k$,
$\Phi$ is the n x n state transition matrix,
$\Gamma$ is the n x r distribution matrix,
and $w_k$ is the r-dimensional system driving noise which has a
Gaussian distribution with zero mean and covariance given by:

$$E[w_k w_l^T] = Q\, \delta_{kl} .$$

Here Q is a positive definite r x r matrix and $\delta_{kl}$ is the
Kronecker delta. At each time step $t_k$, a u-dimensional measurement
vector $z_k$ is available, which is linearly related to the state vector:

$$z_k = H x_k + v_k , \tag{A-2}$$

where H is the u x n measurement matrix
and $v_k$ is the u-dimensional measurement noise which has a Gaussian
distribution with zero mean and covariance given by:

$$E[v_k v_l^T] = R\, \delta_{kl} .$$

Here R is a positive definite u x u matrix and $\delta_{kl}$ is the
Kronecker delta. Also it is assumed that initially at time $t_1$, the
state vector $x_1$ is known to have a Gaussian distribution with mean
$\bar{x}_1$ and covariance $M_1$. In a more general formulation, the
covariance matrices and system matrices may depend on k. However the
resulting filter is similar and so for simplicity of notation, this
dependence is not included.

Using this information, the problem is to determine the pdf of
the state vector at each time step $t_k$, conditional on all the
measurements received up to and including $t_k$. From this pdf an optimal
estimate according to any desired criterion may be obtained.

Since all relationships are linear and all distributions
Gaussian, the required pdf of the state vector at each time step is
also Gaussian (as is shown in what follows). This is why a particularly
neat and elegant recursive solution may be obtained. The Kalman filter
recursion at each time step is essentially a two stage process. In
the first stage, the prior pdf at $t_k$ is updated with the measurement
to obtain the required posterior pdf. In the second stage, this
posterior pdf is predicted forwards to obtain the prior pdf for the
following time step $t_{k+1}$. The recursions for these two stages will
be obtained using Bayesian techniques in the following two sections.
(Different methods and optimization criteria which also lead to the
Kalman filter relations are detailed in Refs 27 to 31.)
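$$
\begin{aligned}
\mathcal{N}(z_k; H x_k, R)\, \mathcal{N}(x_k; \bar{x}_k, M_k)
 &= \left[ (2\pi)^{n+u} |R|\, |M_k| \right]^{-1/2}
    \exp\left\{ -\tfrac{1}{2} (z_k - H x_k)^T R^{-1} (z_k - H x_k)
    - \tfrac{1}{2} (x_k - \bar{x}_k)^T M_k^{-1} (x_k - \bar{x}_k) \right\} \\
 &= \left[ (2\pi)^{n+u} |R|\, |M_k| \right]^{-1/2}
    \exp\left\{ -\tfrac{1}{2} \left[ (x_k - \hat{x}_k)^T P_k^{-1} (x_k - \hat{x}_k) + r' \right] \right\}
\end{aligned}
\tag{A-5}
$$

on combining the quadratic forms (see section A.4), where

$$P_k^{-1} = M_k^{-1} + H^T R^{-1} H . \tag{A-6}$$

So the posterior pdf of $x_k$ is Gaussian with mean $\hat{x}_k$ and
covariance $P_k$. The expression for $P_k$ may be written in a more
convenient form using the matrix inversion theorem (see Ref 27 page 262)
to obtain,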
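A.3 Prediction

To complete the recursion, it is necessary to predict forwards from the
posterior pdf at $t_k$ to obtain the prior pdf at time $t_{k+1}$, ie
$p(x_{k+1} \mid Z_k)$.

Now, by definition,

$$
\begin{aligned}
p(x_{k+1} \mid Z_k) &= \int p(x_{k+1}, x_k \mid Z_k)\, dx_k \\
 &= \int p(x_{k+1} \mid x_k, Z_k)\, p(x_k \mid Z_k)\, dx_k .
\end{aligned}
\tag{A-13}
$$

But $p(x_{k+1} \mid x_k, Z_k) = p(x_{k+1} \mid x_k)$, since given $x_k$,
$Z_k$ contributes no useful information, and from (A-1)

$$p(x_{k+1} \mid x_k) = \mathcal{N}(x_{k+1}; \Phi x_k, \Gamma Q \Gamma^T) . \tag{A-14}$$

Hence, from (A-9) and (A-14)

$$
\begin{aligned}
p(x_{k+1}, x_k \mid Z_k)
 &= \left[ (2\pi)^{2n} |\Gamma Q \Gamma^T|\, |P_k| \right]^{-1/2}
    \exp\left\{ -\tfrac{1}{2} (x_{k+1} - \Phi x_k)^T (\Gamma Q \Gamma^T)^{-1} (x_{k+1} - \Phi x_k)
    - \tfrac{1}{2} (x_k - \hat{x}_k)^T P_k^{-1} (x_k - \hat{x}_k) \right\} \\
 &= \left[ (2\pi)^{2n} |\Gamma Q \Gamma^T|\, |P_k| \right]^{-1/2}
    \exp\left\{ -\tfrac{1}{2} (x_k - d)^T D^{-1} (x_k - d)
    - \tfrac{1}{2} (x_{k+1} - \Phi \hat{x}_k)^T M_{k+1}^{-1} (x_{k+1} - \Phi \hat{x}_k) \right\}
\end{aligned}
\tag{A-15}
$$

on rearranging the quadratic forms (using the result of section A.4),
where d is independent of $x_k$,

$$D^{-1} = \Phi^T (\Gamma Q \Gamma^T)^{-1} \Phi + P_k^{-1}$$

and

$$M_{k+1} = \Phi P_k \Phi^T + \Gamma Q \Gamma^T .$$

The expression (A-15) may be integrated with respect to $x_k$ to give

$$
\int p(x_{k+1}, x_k \mid Z_k)\, dx_k
 = |D|^{1/2} \left[ (2\pi)^n |\Gamma Q \Gamma^T|\, |P_k| \right]^{-1/2}
   \exp\left\{ -\tfrac{1}{2} (x_{k+1} - \Phi \hat{x}_k)^T M_{k+1}^{-1} (x_{k+1} - \Phi \hat{x}_k) \right\} .
$$

Also

$$
\begin{aligned}
|D|^{1/2} \left( |\Gamma Q \Gamma^T|\, |P_k| \right)^{-1/2}
 &= \left( |D^{-1}|\, |\Gamma Q \Gamma^T|\, |P_k| \right)^{-1/2} \\
 &= \left( |P_k^{-1}|\, |\Gamma Q \Gamma^T + \Phi P_k \Phi^T|\, |P_k| \right)^{-1/2} \\
 &= |M_{k+1}|^{-1/2}
\end{aligned}
$$

using standard identities for determinants (see Ref 36).

Therefore,

$$p(x_{k+1} \mid Z_k) = \mathcal{N}(x_{k+1}; \bar{x}_{k+1}, M_{k+1}) ,$$

where $\bar{x}_{k+1} = \Phi \hat{x}_k$

and $M_{k+1} = \Phi P_k \Phi^T + \Gamma Q \Gamma^T$ .

These expressions for the mean $\bar{x}_{k+1}$ and covariance $M_{k+1}$
complete the Kalman filter recursions.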
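The two-stage recursion can be illustrated for a scalar state. This is a minimal sketch under stated assumptions: all quantities are scalars, the covariance update uses the information form (A-6), and the mean update $\hat{x}_k = \bar{x}_k + P_k H R^{-1}(z_k - H\bar{x}_k)$ is taken from the lemma of section A.4 rather than from an equation shown above. The function names are illustrative.

```python
def kf_update(x_prior, m_prior, z, h, r):
    """Measurement update for a scalar state.

    x_prior, m_prior: prior mean and variance
    z: measurement, h: measurement gain, r: measurement noise variance
    Returns the posterior mean and variance.
    """
    # Information form of (A-6): 1/P = 1/M + h * (1/r) * h.
    p_post = 1.0 / (1.0 / m_prior + h * h / r)
    # Posterior mean from the combination-of-quadratic-forms lemma.
    x_post = x_prior + p_post * h / r * (z - h * x_prior)
    return x_post, p_post


def kf_predict(x_post, p_post, phi, gamma, q):
    """Prediction step: prior mean and variance for the next time step."""
    x_next = phi * x_post
    m_next = phi * p_post * phi + gamma * q * gamma
    return x_next, m_next


# One cycle: update with a measurement, then predict forwards.
x_hat, p = kf_update(x_prior=0.0, m_prior=1.0, z=2.0, h=1.0, r=1.0)
x_bar, m = kf_predict(x_hat, p, phi=1.0, gamma=1.0, q=0.5)
```

With equal prior and measurement variances the posterior mean lies midway between the prior mean and the measurement, and the posterior variance is halved before the driving noise is added back in the prediction step.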

A.4 Combination of quadratic forms

Lemma If $B$ and $C$ are symmetric and positive definite, then

$$(a - Ax)^T B^{-1} (a - Ax) + (b - x)^T C^{-1} (b - x) = (x - y)^T D^{-1} (x - y) + r' , \tag{A-17}$$

where $y = b + D A^T B^{-1} (a - Ab)$,

$D^{-1} = A^T B^{-1} A + C^{-1}$

and $r' = (a - Ab)^T (B + A C A^T)^{-1} (a - Ab)$.

Note that $r'$ does not depend on $x$.

Proof Consider the left-hand side of (A-17):

$$
\begin{aligned}
(a - Ax)^T B^{-1} (a - Ax) + (b - x)^T C^{-1} (b - x)
&= x^T (A^T B^{-1} A + C^{-1}) x - x^T (A^T B^{-1} a + C^{-1} b) - (a^T B^{-1} A + b^T C^{-1}) x \\
&\quad + a^T B^{-1} a + b^T C^{-1} b \\
&= x^T D^{-1} x - x^T \left(A^T B^{-1} (a - Ab) + D^{-1} b\right) - \left((a - Ab)^T B^{-1} A + b^T D^{-1}\right) x \\
&\quad + a^T B^{-1} a + b^T C^{-1} b \\
&= x^T D^{-1} x - x^T D^{-1} y - y^T D^{-1} x + a^T B^{-1} a + b^T C^{-1} b \\
&= (x - y)^T D^{-1} (x - y) + r' ,
\end{aligned}
$$

where

$$
\begin{aligned}
r' &= -y^T D^{-1} y + a^T B^{-1} a + b^T C^{-1} b \\
&= -(a - Ab)^T B^{-1} A D A^T B^{-1} (a - Ab) - a^T B^{-1} A b + b^T A^T B^{-1} A b \\
&\quad - b^T A^T B^{-1} a + b^T A^T B^{-1} A b - b^T (A^T B^{-1} A + C^{-1}) b + a^T B^{-1} a + b^T C^{-1} b \\
&= (a - Ab)^T \left[ B^{-1} - B^{-1} A (A^T B^{-1} A + C^{-1})^{-1} A^T B^{-1} \right] (a - Ab)
\end{aligned}
$$

and from the matrix inversion lemma (see Ref 27, p 262), the term in square brackets is equal to

$$(B + A C A^T)^{-1} ,$$

which completes the proof.
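The lemma can be checked numerically on random data. The sketch below assumes NumPy; the dimensions, the seed and the helper `random_spd` are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2  # dimensions of x and of a (arbitrary)

def random_spd(k):
    """Random symmetric positive definite k x k matrix."""
    G = rng.standard_normal((k, k))
    return G @ G.T + k * np.eye(k)

A = rng.standard_normal((m, n))
B, C = random_spd(m), random_spd(n)
a = rng.standard_normal(m)
b = rng.standard_normal(n)
x = rng.standard_normal(n)

B_inv, C_inv = np.linalg.inv(B), np.linalg.inv(C)
D = np.linalg.inv(A.T @ B_inv @ A + C_inv)
y = b + D @ A.T @ B_inv @ (a - A @ b)
r = (a - A @ b) @ np.linalg.inv(B + A @ C @ A.T) @ (a - A @ b)

# Left- and right-hand sides of (A-17)
lhs = (a - A @ x) @ B_inv @ (a - A @ x) + (b - x) @ C_inv @ (b - x)
rhs = (x - y) @ np.linalg.inv(D) @ (x - y) + r
```

Agreement of `lhs` and `rhs` for arbitrary `x` reflects the fact that `r` does not depend on `x`.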
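MEAN AND COVARIANCE OF A MIXTURE DISTRIBUTION
AND THE PDAF ALGORITHM

Consider any mixture distribution with pdf

$$p(x) = \sum_{i=1}^{N} \beta_i\, p_i(x) ,$$

where $p_i(x)$ is a component pdf

and $\beta_i$ is a probability associated with the ith component

such that:

$$\beta_i > 0 \quad \text{and} \quad \sum_{i=1}^{N} \beta_i = 1 .$$

Also let the mean of the ith component be $\hat{x}_i$ and let the covariance of the ith component be $P_i$.

The mean of the mixture is defined by:

$$\hat{x} = \int x\, p(x)\, dx = \sum_{i=1}^{N} \beta_i\, \hat{x}_i . \tag{B-1}$$

The covariance of the mixture is defined by:

$$
\begin{aligned}
P &= \int (x - \hat{x})(x - \hat{x})^T p(x)\, dx \\
&= \int x x^T p(x)\, dx - \hat{x} \hat{x}^T \\
&= \sum_{i=1}^{N} \beta_i \int x x^T p_i(x)\, dx - \hat{x} \hat{x}^T .
\end{aligned}
$$

But

$$P_i = \int x x^T p_i(x)\, dx - \hat{x}_i \hat{x}_i^T ,$$

so

$$P = \sum_{i=1}^{N} \beta_i \left(P_i + \hat{x}_i \hat{x}_i^T\right) - \hat{x} \hat{x}^T . \tag{B-2}$$

Another form for this covariance may be obtained by observing that:

$$
\begin{aligned}
\sum_{i=1}^{N} \beta_i\, \hat{x}_i \hat{x}_i^T - \hat{x} \hat{x}^T
&= \sum_{i=1}^{N} \beta_i\, \hat{x}_i \hat{x}_i^T - \hat{x} \sum_{i=1}^{N} \beta_i\, \hat{x}_i^T - \sum_{i=1}^{N} \beta_i\, \hat{x}_i\, \hat{x}^T + \hat{x} \hat{x}^T \\
&= \sum_{i=1}^{N} \beta_i (\hat{x}_i - \hat{x})(\hat{x}_i - \hat{x})^T .
\end{aligned}
$$

Therefore

$$P = \sum_{i=1}^{N} \beta_i P_i + \sum_{i=1}^{N} \beta_i (\hat{x}_i - \hat{x})(\hat{x}_i - \hat{x})^T . \tag{B-3}$$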
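Equations (B-1) and (B-3) translate directly into a few lines of array code. The following is a minimal sketch assuming NumPy; the function name `mixture_moments` and the two-component example are hypothetical.

```python
import numpy as np

def mixture_moments(weights, means, covs):
    """Mean (B-1) and covariance (B-3) of a mixture with the given component moments."""
    w = np.asarray(weights, dtype=float)   # shape (N,)
    mu = np.asarray(means, dtype=float)    # shape (N, n)
    Ps = np.asarray(covs, dtype=float)     # shape (N, n, n)
    mean = w @ mu
    dev = mu - mean                        # component means relative to mixture mean
    within = np.einsum("i,ijk->jk", w, Ps)            # sum_i beta_i P_i
    between = np.einsum("i,ij,ik->jk", w, dev, dev)   # spread of the component means
    return mean, within + between

# Two equally weighted components separated along the first axis
weights = [0.5, 0.5]
means = [[-1.0, 0.0], [1.0, 0.0]]
covs = [np.eye(2), np.eye(2)]
mean, cov = mixture_moments(weights, means, covs)
```

The second term of (B-3) is what inflates the covariance along the axis on which the component means differ.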


B.2 For the PDAF, the posterior mixture distribution is approximated by a single Gaussian at every time step. The Gaussian approximation is chosen to have the same mean and covariance as the mixture. The PDAF is an efficient algorithm because, for this approximation, explicit calculation of the mean and covariance of each individual mixture component may be avoided.

Due to the single Gaussian approximation, the prior pdf at any time step is approximated by:

$$p(x \mid \mathcal{Z}) = \mathcal{N}(x;\ \bar{x},\ M) .$$

After update by a set $Z$ of $m$ measurements, the posterior pdf is:

$$p(x \mid Z) = \sum_{l=0}^{m} \beta_l\, \mathcal{N}(x;\ \hat{x}_l,\ P_l) , \tag{B-4}$$

where

$$
\hat{x}_l = \begin{cases} \bar{x} + K \nu_l & \text{if } l \neq 0 \\ \bar{x} & \text{if } l = 0 , \end{cases}
\qquad
\nu_l = z_l - H \bar{x} ,
\qquad
P_l = \begin{cases} P' & \text{if } l \neq 0 \\ M & \text{if } l = 0 , \end{cases}
$$

where $P'$ and $K$ are obtained from the usual Kalman filter update relations, equation (2.8). For the PDAF approximation we only require the mean $\hat{x}$ and covariance $P$ of equation (B-4). From equation (B-1), the mean is given by:

$$\hat{x} = \sum_{l=0}^{m} \beta_l\, \hat{x}_l = \bar{x} + K \nu , \tag{B-5}$$

where $\nu = \sum_{l=1}^{m} \beta_l\, \nu_l$.

Therefore we have:

$$
\hat{x}_l - \hat{x} = \begin{cases} K(\nu_l - \nu) & \text{if } l \neq 0 \\ -K \nu & \text{if } l = 0 . \end{cases}
$$

Substituting this into equation (B-3) gives the required covariance $P$ of (B-4):

$$
\begin{aligned}
P &= \beta_0 \left(M + K \nu \nu^T K^T\right) + \sum_{l=1}^{m} \beta_l \left[P' + K (\nu_l - \nu)(\nu_l - \nu)^T K^T\right] \\
&= \beta_0 M + (1 - \beta_0) P' + K \left[\sum_{l=1}^{m} \beta_l\, \nu_l \nu_l^T - \nu \nu^T\right] K^T .
\end{aligned}
\tag{B-6}
$$

Note that the computational effort necessary to evaluate equations (B-5) and (B-6) is modest in comparison with the full Bayesian filter (see Ref 11).
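A short numerical sketch of (B-5) and (B-6) follows. It assumes NumPy and assumes that the "usual Kalman filter update relations" take the standard form $S = HMH^T + R$, $K = MH^TS^{-1}$, $P' = M - KHM$; equation (2.8) itself is not reproduced in this appendix, and the function name `pdaf_moments` and the example values are hypothetical.

```python
import numpy as np

def pdaf_moments(x_bar, M, H, R, zs, betas):
    """PDAF posterior mean (B-5) and covariance (B-6).

    betas[0] is the probability that all measurements are false;
    betas[l] (l >= 1) is the probability that zs[l-1] is the true measurement.
    """
    S = H @ M @ H.T + R                  # innovation covariance (assumed standard form)
    K = M @ H.T @ np.linalg.inv(S)       # Kalman gain
    P_upd = M - K @ H @ M                # P' of the standard update
    nus = np.asarray(zs) - H @ x_bar     # innovations nu_l, one row per measurement
    b = np.asarray(betas[1:])
    nu = b @ nus                         # combined innovation
    x_hat = x_bar + K @ nu               # (B-5)
    spread = np.einsum("l,li,lj->ij", b, nus, nus) - np.outer(nu, nu)
    P = betas[0] * M + (1.0 - betas[0]) * P_upd + K @ spread @ K.T   # (B-6)
    return x_hat, P

# Hypothetical example: position-only measurement of a 2-state target
x_bar = np.array([0.0, 0.0])
M = np.eye(2)
H = np.array([[1.0, 0.0]])
R = np.array([[1.0]])
zs = [[1.0], [-0.5]]
betas = [0.2, 0.5, 0.3]
x_hat, P = pdaf_moments(x_bar, M, H, R, zs, betas)
```

Note that the component means and covariances of (B-4) are never formed explicitly, which is the source of the PDAF's efficiency.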
THE JOINING ALGORITHM WITH MEASURE $d_{ij}^2$
IS NOT SUBJECT TO REVERSALS

Suppose that at some stage during mixture reduction, the closest components according to the distance measure given by (3.8) of section 3.6.1 have means $x$ and $y$ and weights $\beta_x$ and $\beta_y$. The distance between these components is $d_{min}$, where:

$$d_{min}^2 = f(\beta_x, \beta_y)\, \|x - y\|^2 ,$$

where $\|x - y\|^2 = (x - y)^T P^{-1} (x - y)$,

and $f(\beta_x, \beta_y) = \beta_x \beta_y / (\beta_x + \beta_y)$.

As they are closest, these two components are merged to produce a new component with mean:

$$w = \frac{\beta_x x + \beta_y y}{\beta_x + \beta_y}$$

and weight $\beta_w = \beta_x + \beta_y$.

Now consider any other component with mean $z$ and weight $\beta_z$. The distance between this component and either of the two which have been merged must be greater than or equal to $d_{min}$, so:

$$d_{min}^2 \le d_{xz}^2 = f(\beta_x, \beta_z)\, \|x - z\|^2 \tag{C-1}$$

$$d_{min}^2 \le d_{yz}^2 = f(\beta_y, \beta_z)\, \|y - z\|^2 \tag{C-2}$$

To confirm that the minimum distance increases monotonically as reduction proceeds (ie it is not subject to reversals), we must prove that:

$$d_{zw}^2 \ge d_{min}^2 .$$

Now:

$$
\begin{aligned}
d_{zw}^2 &= f(\beta_w, \beta_z)\, \|z - w\|^2 \\
&= f(\beta_w, \beta_z) \left\| \frac{\beta_x x + \beta_y y}{\beta_w} - z \right\|^2 \\
&= f(\beta_w, \beta_z) \left\| \frac{\beta_x}{\beta_w}(x - z) + \frac{\beta_y}{\beta_w}(y - z) \right\|^2 \\
&= f(\beta_w, \beta_z) \left[ \frac{\beta_y}{\beta_w} \|z - y\|^2 + \frac{\beta_x}{\beta_w} \|z - x\|^2 - \frac{\beta_x \beta_y}{\beta_w^2} \|x - y\|^2 \right] .
\end{aligned}
$$

Since

$$\frac{f(\beta_w, \beta_z)}{\beta_w} = \frac{\beta_w \beta_z}{\beta_w (\beta_w + \beta_z)} = \frac{\beta_z}{\beta_w + \beta_z}$$

and using the definition of the distance measure:

$$d_{zw}^2 = \frac{1}{\beta_w + \beta_z} \left[ (\beta_y + \beta_z)\, d_{yz}^2 + (\beta_x + \beta_z)\, d_{xz}^2 - \beta_z\, d_{min}^2 \right] .$$

Hence from (C-1) and (C-2):

$$
\begin{aligned}
d_{zw}^2 &\ge \frac{1}{\beta_w + \beta_z} \left[ (\beta_y + \beta_z)\, d_{min}^2 + (\beta_x + \beta_z)\, d_{min}^2 - \beta_z\, d_{min}^2 \right] \\
&= \frac{\beta_x + \beta_y + \beta_z}{\beta_w + \beta_z}\, d_{min}^2 = d_{min}^2 .
\end{aligned}
$$

This completes the proof.
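The absence of reversals can be observed empirically by merging a random mixture pairwise and recording the minimum distance at each step. The sketch below assumes NumPy and a fixed normalizing matrix passed as `P_inv` (the identity in the example); the function name `join_min_distances` and the random test mixture are hypothetical, and component covariances are ignored because the distance measure above does not use them.

```python
import numpy as np

def join_min_distances(weights, means, P_inv, n_final):
    """Merge the closest pair repeatedly; return the minimum squared distance at each merge.

    Distance: d_ij^2 = b_i b_j / (b_i + b_j) * (x_i - x_j)^T P^-1 (x_i - x_j).
    """
    w = [float(b) for b in weights]
    x = [np.asarray(m, dtype=float) for m in means]
    history = []
    while len(w) > n_final:
        best = None
        for i in range(len(w)):
            for j in range(i + 1, len(w)):
                diff = x[i] - x[j]
                d2 = w[i] * w[j] / (w[i] + w[j]) * (diff @ P_inv @ diff)
                if best is None or d2 < best[0]:
                    best = (d2, i, j)
        d2, i, j = best
        history.append(d2)
        merged_w = w[i] + w[j]
        merged_x = (w[i] * x[i] + w[j] * x[j]) / merged_w
        del w[j], x[j]                 # j > i, so index i is still valid
        w[i], x[i] = merged_w, merged_x
    return history

rng = np.random.default_rng(1)
weights = rng.random(8)
weights /= weights.sum()
means = rng.standard_normal((8, 2))
history = join_min_distances(weights, means, np.eye(2), n_final=2)
```

If the proof holds, `history` is a non-decreasing sequence for any positive definite `P_inv` held fixed during the reduction.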
COMPUTATIONAL REQUIREMENTS OF THE REDUCTION ALGORITHMS

D.1 The Joining Algorithm (Fig 3.1)

As explained in section 3.6.1 the operation of the Joining Algorithm is centred around a symmetric distance matrix $(d_{ij}^2)$ with $d_{ii} = 0$. Thus it is only necessary to store the upper triangular part of the matrix, which occupies $(N^2 - N)/2$ storage locations, where $N$ is the original number of components in the mixture.

The most time-consuming operations are the evaluation and comparison of the distance measures. The calculation of each distance involves the evaluation of a quadratic form which requires of the order of $n^2$ multiplications and $n^2$ additions, where $n$ is the dimension of the state space. Note however that the matrix $P$ in the distance formula equation (3.8) is constant, since the merging of components preserves the overall mixture covariance. Thus only one matrix inversion suffices for all distance evaluations.

To reduce a mixture from $N$ to $M$ components, the number of distance calculations required is

$$N_{DJ} = \frac{N^2 - N}{2} + \sum_{i=2}^{N-M+1} (N - i) = N(N - 2) - \tfrac12 M(M - 3) . \tag{D-1}$$

The identification of $\min_{i,j} d_{ij}^2$, where $i < j$, requires

$$\tfrac12 m(m - 1) - 1$$

comparisons at each iteration, where $m$ is the number of remaining components. Thus the total number of comparisons required during the reduction of a mixture from $N$ to $M$ components is

$$N_{CJ} = \sum_{m=M}^{N} \left\{ \tfrac12 m(m - 1) - 1 \right\} = \tfrac16 (N - M + 1)\left[ N^2 + N(M - 1) + (M - 1)^2 - 7 \right] . \tag{D-2}$$

There are $N - M + 1$ terms in the summation because one extra evaluation of $\min_{i,j} d_{ij}^2$ is required for the algorithm stopping criterion. Note that the required number of comparisons is of order $N^3$ and the number of distance calculations is of order $N^2$. The number of these operations is shown in Fig D.1 as a function of $M$, for the cases $N = 100$ and $N = 15$. The value of $N$ clearly dominates the number of operations, and although this decreases with $M$, the decrease is small while $M < N/2$. Note that the number of comparisons required to find the components with the lowest $\beta$ weights (see Fig 3.1) has not been included in the above total as their number is relatively insignificant.
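The closed forms (D-1) and (D-2) can be cross-checked by stepping through the bookkeeping of the Joining Algorithm and counting operations directly. This is a plain-Python sketch; the function names are hypothetical.

```python
def joining_counts(N, M):
    """Count distance calculations and comparisons for a reduction from N to M components."""
    distances = N * (N - 1) // 2              # initial upper-triangular distance matrix
    comparisons = 0
    m = N
    while True:
        comparisons += m * (m - 1) // 2 - 1   # search for the minimum of m(m-1)/2 entries
        if m == M:
            break                             # last search serves the stopping criterion
        m -= 1                                # merge the closest pair
        distances += m - 1                    # new component against the m-1 others
    return distances, comparisons

def n_dj(N, M):
    """Closed form (D-1)."""
    return N * (N - 2) - M * (M - 3) // 2

def n_cj(N, M):
    """Closed form (D-2)."""
    return (N - M + 1) * (N * N + N * (M - 1) + (M - 1) ** 2 - 7) // 6

counts_100_10 = joining_counts(100, 10)
```

For example, `joining_counts(3, 2)` gives 4 distance calculations (3 initial plus 1 after the merge) and 2 comparisons.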

D.2  The  Clustering  Algorithm  (Fig  3.12) 

Unlike the Joining Algorithm whose computational cost can be predicted quite accurately, the cost of the Clustering Algorithm is very dependent on how quickly the mixture components are clustered and on how many iterations are required to adequately reduce the mixture. The most time-consuming operations for the algorithm are distance evaluations and comparisons; the merging of selected components into a single Gaussian is relatively inexpensive.

In the formation of a single cluster, the distance from the cluster centre to every unclustered component must be evaluated. The total number of distance calculations required for an iteration of the algorithm is

$$N'_{DC} = M'(N' - 1) - \sum_{i=1}^{M'-1} (M' - i)\, m_i , \tag{D-3}$$

where $N'$ is the number of components at the start of the iteration, $M'$ is the total number of clusters formed during the iteration and $m_i$ is the number of components combined into the ith cluster. For given $N'$ and $M'$, bounds on $N'_{DC}$ may be obtained by considering the most and least favourable values for $m_i$. The lower bound is obtained when $N' - (M' - 1)$ components are combined into the first cluster so that all further clusters only contain one element, ie

$$m_i = \begin{cases} N' - (M' - 1) & \text{if } i = 1 \\ 1 & \text{otherwise.} \end{cases} \tag{D-4}$$

Thus the lower bound on $N'_{DC}$ is given by, from equation (D-3),

$$L'_{DC} = N' + \tfrac12 M'(M' - 3) . \tag{D-5}$$

The upper bound is obtained if the first $M' - 1$ clusters only contain one component, so

$$m_i = 1 \ \text{ for } i < M' \quad \text{and} \quad m_{M'} = N' - (M' - 1) . \tag{D-6}$$

Thus the upper bound on $N'_{DC}$ is given by, from equation (D-3),

$$U'_{DC} = M'\left(N' - \tfrac12 (M' + 1)\right) . \tag{D-7}$$
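The per-iteration count (D-3) and its bounds (D-5) and (D-7) can be illustrated by counting directly from a list of cluster sizes. This is a plain-Python sketch with hypothetical function names; it assumes the cluster sizes sum to the number of components at the start of the iteration.

```python
def clustering_distance_count(n_start, cluster_sizes):
    """Distance calculations for one iteration: each cluster centre is
    compared with every other component not yet clustered."""
    remaining = n_start
    total = 0
    for m_i in cluster_sizes:
        total += remaining - 1    # centre against the other unclustered components
        remaining -= m_i          # the cluster (centre included) is removed
    return total

def lower_bound(n_start, n_clusters):
    """(D-5): first cluster absorbs as many components as possible."""
    return n_start + n_clusters * (n_clusters - 3) // 2

def upper_bound(n_start, n_clusters):
    """(D-7): every cluster but the last contains a single component."""
    return n_clusters * (2 * n_start - n_clusters - 1) // 2

best = clustering_distance_count(15, [11, 1, 1, 1, 1])    # sizes of (D-4)
worst = clustering_distance_count(15, [1, 1, 1, 1, 11])   # sizes of (D-6)
```

With $N' = 15$ and $M' = 5$ the two extreme cluster-size patterns give 20 and 60 distance calculations, matching (D-5) and (D-7).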
Also, since the distance measure for each cluster is normalized by the covariance of the cluster centre (see equation (3.9)), $M'$ matrix inversions are required. In Fig D.2, $U'_{DC}$ and $L'_{DC}$ are shown as a function of $M'$ for $N' = 100$ and $N' = 15$.

To select the components for clustering, each of the $N'_{DC}$ distances must be compared with the clustering threshold. Also comparisons are required to identify those components which are closest together so that they can be merged if no components are clustered. However if components are clustered the minimum distance is no longer required, and so the search for the closest components is abandoned at this stage. Thus the minimum number of comparisons required for an iteration occurs when $m_i$ is given by equation (D-4) and when the first component to be examined is clustered, so

$$L'_{CC} = L'_{DC} . \tag{D-8}$$

The maximum number of comparisons required for an iteration occurs when every cluster contains only one component (no components are clustered), ie

$$m_i = 1 \ \text{ for all } i .$$

In this case

$$U'_{CC} = 2U'_{DC} - 1 . \tag{D-9}$$

Clearly for mixtures with a large number of components, such as $N' = 100$, the first iteration of the Clustering Algorithm could involve a very large number of distance evaluations and comparisons (see Fig D.2). However, in practice it has been found that the number of operations is usually well below the upper bounds $U'_{DC}$ and $U'_{CC}$, and that mixtures with a large number of components are usually significantly reduced after the first iteration (ie $M' \ll N'$). Thus if further iterations are necessary, the number of components involved is usually fairly modest.

The Clustering Algorithm would be most expensive in the unlikely circumstance of no component ever being clustered. In this case the closest two components would be combined at each iteration, so the mixture would only be reduced by one component per iteration. This provides an upper bound on the total number of operations. For this worst case we also assume that $B_T = 0$, so that every one of the $N'$ components at the start of an iteration is considered as a possible cluster centre. Thus the number of distance evaluations and comparisons for each iteration is given by $U'_{DC}$ and $U'_{CC}$ with $M' = N'$. Also $N'$ matrix inversions are required for each iteration. Thus an upper bound on the total number of matrix inversions required to reduce a mixture from $N$ to $M$ components is given by

$$\sum_{N'=M+1}^{N} N' = \tfrac12 (N - M)(N + M + 1) . \tag{D-10}$$

The total number of distance calculations is bounded by

$$U_{DC} = \sum_{N'=M+1}^{N} \tfrac12 N'(N' - 1) = \tfrac16 (N - M)(N^2 + NM + M^2 - 1) \tag{D-11}$$

and the total number of comparisons is bounded by

$$U_{CC} = 2U_{DC} - (N - M) . \tag{D-12}$$

These upper bounds on the number of operations are shown in Fig D.3 as a function of $M$, for $N = 100$ and $N = 15$. In the best possible case all components that are clustered are combined into the first cluster on the first iteration. Thus $M$ is a lower bound on the total number of matrix inversions and equation (D-5), with $N' = N$ and $M' = M$, gives a lower bound on the total number of distance calculations and comparisons.
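The worst-case totals (D-10) to (D-12) are simple sums over the iterations and can be checked against their closed forms. This is a plain-Python sketch with hypothetical function names; it takes the per-iteration worst case of $\tfrac12 N'(N'-1)$ distance calculations as given above.

```python
def worst_case_totals(N, M):
    """Sum the per-iteration worst case over iterations starting with N' = M+1, ..., N components."""
    sizes = range(M + 1, N + 1)
    inversions = sum(sizes)                              # N' inversions per iteration
    distances = sum(n * (n - 1) // 2 for n in sizes)     # N'(N'-1)/2 distances per iteration
    comparisons = 2 * distances - (N - M)                # (D-12)
    return inversions, distances, comparisons

def closed_forms(N, M):
    """Closed forms (D-10), (D-11) and (D-12)."""
    inversions = (N - M) * (N + M + 1) // 2
    distances = (N - M) * (N * N + N * M + M * M - 1) // 6
    return inversions, distances, 2 * distances - (N - M)

totals = worst_case_totals(100, 10)
```

For $N = 100$ and $M = 10$ this gives 4995 matrix inversions, 166485 distance calculations and 332880 comparisons in the worst case.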

D.3 Comparison of operation counts for the two algorithms

If the original number $N$ of components in the mixture is large compared with the number $M$ of components after reduction, the number of operations required by the Joining Algorithm lies between the upper and lower bounds of the number of operations for the Clustering Algorithm. This is shown in Table D.1. For the simulation example reported in Chapter 4, the Clustering Algorithm was consistently more efficient than the Joining Algorithm. Also it should be noted that for the Joining Algorithm a large distance matrix must be stored. For the Clustering Algorithm, storage requirements over those necessary to hold the mixture components are negligible.

Table D.1

Operation counts for the Joining Algorithm and the Clustering Algorithm when N is large compared with M

Operation               Joining            Upper bound for          Lower bound for
                        Algorithm          Clustering Algorithm     Clustering Algorithm

Distance calculations   N_DJ = O(N^2)      U_DC = O(N^3)            L_DC = O(N)
Comparisons             N_CJ = O(N^3)      U_CC ≈ 2 N_CJ            L_CC = O(N)
Matrix inversions       1                  O(N^2)                   M

[Fig D.1: number of comparisons N_CJ and number of distance calculations N_DJ for the Joining Algorithm]
RECURSIVE SOLUTION OF MULTIPLE SENSOR FILTER OF SECTION 7.3

Since only one measurement from each of the $N_s$ sensors may be true, a measurement association hypothesis on the data at a particular time step may be denoted

$$(\psi_{1 j_1},\ \psi_{2 j_2},\ \ldots,\ \psi_{N_s j_{N_s}}) ,$$

where $\psi_{u j_u}$ indicates that the $j_u$th measurement from sensor $u$ is true if $j_u \neq 0$, and $\psi_{u0}$ indicates that all measurements from sensor $u$ are false. Since each sensor is independent, information from each sensor may be incorporated sequentially using the update relations of Chapter 2. Suppose that data from the first $u-1$ of the sensors have been incorporated and let $\Psi^k_{u-1,i}$ denote a hypothesis on the measurements from the $u-1$ sensors and from all previous time steps (the subscript $k$ and the conditioning on $\mathcal{Z}$ have been omitted). The subscript $i$, which enumerates all these hypotheses, runs from 1 to $n^k_{u-1}$. To incorporate measurements from sensor $u$, the set of feasible hypotheses must be widened to include

$$\Psi^k_{uc} = \left(\Psi^k_{u-1,i},\ \psi_{uj}\right)$$

for $i = 1, \ldots, n^k_{u-1}$ and $j = 0, \ldots, m_u$

and

$$c = (i - 1)(m_u + 1) + j + 1 . \tag{E-1}$$
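The index mapping (E-1) assigns a distinct number to every pairing of a prior hypothesis with a sensor-u association. A plain-Python sketch (the function name `hypothesis_index` and the example sizes are hypothetical) shows that the mapping enumerates the widened hypothesis set without gaps or repeats:

```python
def hypothesis_index(i, j, m_u):
    """Index c of (E-1) for prior hypothesis i (1-based) and association j (0..m_u)."""
    return (i - 1) * (m_u + 1) + j + 1

n_prev, m_u = 3, 2   # 3 prior hypotheses, 2 measurements from sensor u
indices = [hypothesis_index(i, j, m_u)
           for i in range(1, n_prev + 1)
           for j in range(m_u + 1)]
```

The number of hypotheses therefore grows by a factor of $m_u + 1$ each time a sensor is incorporated, which is the growth that mixture reduction is intended to control.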
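Thus the posterior pdf of $x$ after the inclusion of measurements from sensor $u$ is given by

$$p(x \mid Z^u) = \sum_{j=0}^{m_u} \sum_{i=1}^{n^k_{u-1}} p\left(x \mid \psi_{uj}, \Psi^k_{u-1,i}, Z^u\right) \Pr\left\{\psi_{uj}, \Psi^k_{u-1,i} \mid Z^u\right\} ,$$

where $Z^u = \{Z_1, \ldots, Z_u\}$.

As in section 2.3.2, it can be shown that

$$p\left(x \mid \psi_{uj}, \Psi^k_{u-1,i}, Z^u\right) = \mathcal{N}\left(x;\ \hat{x}^k_{uij},\ P^k_{uij}\right) ,$$

where

$$
\begin{aligned}
\hat{x}^k_{uij} &= \begin{cases} \bar{x}^k_{ui} + K^k_{ui}\left(z_{uj} - H_u \bar{x}^k_{ui}\right) & \text{if } j \neq 0 \\ \bar{x}^k_{ui} & \text{if } j = 0 , \end{cases} \\
K^k_{ui} &= P^k_{uij} H_u^T R_u^{-1} \quad \text{for } j \neq 0 , \\
P^k_{uij} &= \begin{cases} M^k_{ui} - M^k_{ui} H_u^T \left(S^k_{ui}\right)^{-1} H_u M^k_{ui} & \text{if } j \neq 0 \\ M^k_{ui} & \text{if } j = 0 , \end{cases} \\
S^k_{ui} &= H_u M^k_{ui} H_u^T + R_u .
\end{aligned}
\tag{E-2}
$$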
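The per-hypothesis update relations above have the form of a standard Kalman measurement update, and can be sketched as follows. The example assumes NumPy; the function name `hypothesis_update`, the convention that `z=None` stands for the all-false association, and the example values are hypothetical.

```python
import numpy as np

def hypothesis_update(x_bar, M, H, R, z=None):
    """Mean and covariance of x under one association hypothesis.

    z is the measurement assumed true; z=None means all measurements are false,
    in which case the prior mean and covariance are returned unchanged.
    """
    if z is None:
        return x_bar.copy(), M.copy()
    S = H @ M @ H.T + R                               # innovation covariance
    P = M - M @ H.T @ np.linalg.inv(S) @ H @ M        # updated covariance
    K = P @ H.T @ np.linalg.inv(R)                    # gain written in terms of P
    x_hat = x_bar + K @ (z - H @ x_bar)
    return x_hat, P

x_bar = np.array([0.0, 0.0])
M = np.eye(2)
H = np.array([[1.0, 0.0]])
R = np.array([[1.0]])

x_true, P_true = hypothesis_update(x_bar, M, H, R, z=np.array([2.0]))
x_false, P_false = hypothesis_update(x_bar, M, H, R)
```

The gain expression $P H^T R^{-1}$ used here is algebraically equal to the more familiar $M H^T S^{-1}$, so either form may be used.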


In the above relations $\bar{x}^k_{ui}$ and $M^k_{ui}$ are the mean and covariance of the Gaussian distribution of $x$ under hypothesis $\Psi^k_{u-1,i}$ and so are available from the processing of data from sensor $u-1$. Note that with a minor change of notation, equation (E-2) is identical to equations (2.8) and (2.9). Likewise, by analogy with section 2.3.2,




REFERENCES

No.  Author. Title, etc

1  S.S. Blackman. Multiple-target tracking with radar applications. Artech House (1986)

2  Y. Bar-Shalom, T.E. Fortmann. Tracking and data association. Academic Press (1988)

3  Y. Bar-Shalom. Tracking methods in a multi-target environment. IEEE Trans on Automatic Control, Vol AC-23, No.4, August 1978

4  A. Farina, S. Pardini. Survey of radar data-processing techniques in air-traffic-control and surveillance systems. IEE Proceedings, Vol 127, Part F, No.3, June 1980

5  I.R. Goodman, H.L. Wiener, W.W. Willman. Naval ocean-surveillance correlation handbook (1979). NRL Report 8402, Naval Research Laboratory, Washington, DC, September 1980

6  P. Smith, G. Buechler. A branching algorithm for discriminating and tracking multiple objects. IEEE Trans on Automatic Control, Vol AC-20, No.1, February 1975

7  C.L. Morefield. Application of 0-1 integer programming to multitarget tracking problems. IEEE Trans on Automatic Control, Vol AC-22, No.3, June 1977

8  R.W. Sittler. An optimal data association problem in surveillance theory. IEEE Trans on Military Electronics, Vol MIL-8, No.2, April 1964

9  J.J. Stein, S.S. Blackman. Generalized correlation of multi-target track data. IEEE Trans on Aerospace and Electronic Systems, Vol AES-11, No.6, November 1975
10  R.A. Singer, R.G. Sea, K.B. Housewright. Derivation and evaluation of improved tracking filters for use in a dense multi-target environment. IEEE Trans on Information Theory, Vol IT-20, No.4, July 1974

11  Y. Bar-Shalom, E. Tse. Tracking in a cluttered environment with probabilistic data association. Automatica, Vol 11, September 1975

12  K. Birmiwal, Y. Bar-Shalom. On tracking a manoeuvring target in clutter. IEEE Trans on Aerospace and Electronic Systems, Vol AES-20, No.5, September 1984

13  Y. Bar-Shalom, G.D. Marcus. Tracking with measurements of uncertain origin and random arrival times. IEEE Trans on Automatic Control, Vol AC-25, No.4, August 1980

14  A. Houlès, Y. Bar-Shalom. Multisensor tracking of a manoeuvring target in clutter. Proceedings of NAECON (1987)

15  Y. Bar-Shalom, K. Birmiwal. Consistency and robustness of PDAF for target tracking in cluttered environments. Automatica, Vol 19, No.4 (1983)

16  T.E. Fortmann, Y. Bar-Shalom, M. Scheffe, S. Gelfand. Detection thresholds for tracking in clutter - a connection between estimation and signal processing. IEEE Trans on Automatic Control, Vol AC-30, No.3, March 1985

17  Y. Bar-Shalom. Extension of the Probabilistic Data Association filter to multi-target environment. Proceedings of the fifth Symposium on Non-linear Estimation, San Diego, CA, September 1974
18  T.E. Fortmann, Y. Bar-Shalom, M. Scheffe. Multi-target tracking using joint probabilistic data association. Proceedings of the 19th IEEE Conference on Decision and Control, Albuquerque, NM, December 1980

19  T.E. Fortmann, Y. Bar-Shalom, M. Scheffe. Sonar tracking of multiple targets using joint probabilistic data association. IEEE Journal of Oceanic Engineering, Vol OE-8, No.3, July 1983

20  K-C. Chang, Y. Bar-Shalom. Joint probabilistic data association for multi-target tracking with possibly unresolved measurements and manoeuvres. IEEE Trans on Automatic Control, Vol AC-29, No.7, July 1984

21  K-C. Chang, C-Y. Chong, Y. Bar-Shalom. Joint probabilistic data association in distributed sensor networks. IEEE Trans on Automatic Control, Vol AC-31, No.10, October 1986

22  K-C. Chang, Y. Bar-Shalom. A simplification of the JPDAM. IEEE Trans on Automatic Control, Vol AC-31, No.10, October 1986

23  D.B. Reid. An algorithm for tracking multiple targets. IEEE Trans on Automatic Control, Vol AC-24, No.6, December 1979

24  R.J. Kenefic. Optimum tracking of a manoeuvring target in clutter. IEEE Trans on Automatic Control, Vol AC-26, No.3, June 1981

25  D.L. Alspach. A Gaussian sum approach to the multi-target identification-tracking problem. Automatica, Vol 11, No.3, May 1975
26  S. Mori, C-Y. Chong, E. Tse, R.P. Wishner. Tracking and classifying multiple targets without a priori identification. IEEE Trans on Automatic Control, Vol AC-31, No.5, May 1986

27  A.H. Jazwinski. Stochastic processes and filtering theory. Academic Press (1970)

28  R.C. Lee. Optimal estimation, identification and control. MIT Research Monograph 28 (1964)

29  A.E. Bryson, Y.C. Ho. Applied optimal control. John Wiley and Sons (1975)

30  A.P. Sage, J.L. Melsa. Estimation theory with applications to communications and control. McGraw-Hill (1971)

31  J.S. Meditch. Stochastic linear estimation and control. McGraw-Hill (1969)

32  A. Gelb. Applied optimal estimation. MIT Press (1974)

33  B.S. Everitt, D.J. Hand. Finite mixture distributions. Chapman and Hall (1981)

34  D.M. Titterington, A.F.M. Smith, U.E. Makov. Statistical analysis of finite mixture distributions. John Wiley and Sons (1985)

35  V. Nagarajan, R.N. Sharma, M.R. Chidambara. An algorithm for tracking a manoeuvring target in clutter. IEEE Trans on Aerospace and Electronic Systems, Vol AES-20, No.5, September 1984

36  H.W. Sorenson. Parameter estimation, principles and problems. Marcel Dekker Inc. (1980)


37  M. Athans, R.H. Whiting, M. Gruber. A sub-optimal estimation algorithm with probabilistic editing for false measurements with applications to target tracking with wake phenomena. IEEE Trans on Automatic Control, Vol AC-22, No.3, June 1977

38  K.R. Pattipati, N.R. Sandell. A unified view of state estimation in switching environments. Proceedings of American Control Conference (1983)

39  A.G. Jaffer, S.C. Gupta. On estimation of discrete processes under multiplicative and additive noise conditions. Information Sciences, Vol 3, No.3, July 1971

40  H.A.P. Blom. Overlooked potential for systems with Markovian coefficients. Proc of 25th IEEE Conference on Decision and Control, Athens, December 1986

41  G.A. Ackerson, K.S. Fu. On state estimation in switching environments. IEEE Trans on Automatic Control, Vol AC-15, No.1, February 1970

42  J.K. Tugnait. Detection and estimation for abruptly changing systems. Automatica, Vol 18, No.5, September 1982

43  J.K. Tugnait, A.H. Haddad. A detection-estimation scheme for state estimation in switching environments. Automatica, Vol 15, No.4, July 1979

44  D.G. Lainiotis, S.K. Park. On joint detection, estimation and system identification: discrete data case. International Journal of Control, Vol 17, No.3 (1973)

45  J.L. Weiss, T.N. Upadhyay, R. Tenney. Finite computable filters for linear systems subject to time varying model uncertainty. Proceedings of NAECON (1983)
46  T. Kailath. The divergence and Bhattacharyya distance measures in signal selection. IEEE Trans on Communication Technology, Vol COM-15, No.1, February 1967

47  D.J. Hand. Discrimination and classification. John Wiley (1981)

48  M.R. Anderberg. Cluster analysis for applications. Academic Press (1973)

49  A.W. Bridgewater. Analysis of second and third order steady-state tracking filters. From: AGARD Conference Proceedings No.252, Monterey, California (1978)

50  D.J. Salmond. The characteristics of the second-order target model. Royal Aircraft Establishment, Technical Report TR 85071, August 1985

51  M. Gauvrit. Bayesian adaptive filter for tracking with measurements of uncertain origin. Automatica, Vol 20, No.2 (1984)

52  H.A.P. Blom. An efficient filter for abruptly changing systems. Proc of 23rd IEEE Conference on Decision and Control, Las Vegas (1984)

53  D.J. Salmond. The Kalman tracking filter, the α-β filter and smoothing filters. Royal Aircraft Establishment, Technical Memorandum AW48 (1981)

54  D.J. Salmond. Fixed point smoothing filters and trajectory estimation. Royal Aircraft Establishment, Technical Memorandum AW58 (1982)

55  A.K. Mahalanabis, Bin Zhou. A Joint Probabilistic Data Association Smoothing Algorithm for Multi-target Tracking. Proceedings of American Control Conference (1988)


REPORT DOCUMENTATION PAGE

Overall security classification of this page: UNLIMITED

As far as possible this page should contain only unclassified information. If it is necessary to enter classified information, the box above must be marked to indicate the classification, e.g. Restricted, Confidential or Secret.

1. DRIC Reference (to be added by DRIC):
2. Originator's Reference: RAE TM AW 121
3. Agency Reference:
4. Report Security Classification/Marking: UNLIMITED
5. DRIC Code for Originator: 7673000W
5a. Sponsoring Agency's Code:
6. Originator (Corporate Author) Name and Location: Royal Aerospace Establishment, Farnborough, Hants, UK
6a. Sponsoring Agency (Contract Authority) Name and Location:
7. Title: Tracking in uncertain environments
7a. (For Translations) Title in Foreign Language:
7b. (For Conference Papers) Title, Place and Date of Conference:
8. Author 1. Surname, Initials: Salmond, D.J.
9a. Author 2:
9b. Authors 3, 4:
10. Date: September 1989. Pages: 286. Refs: 55
11. Contract Number:
12. Period:
13. Project:
14. Other Reference Nos.:
15. Distribution statement
(a) Controlled by -
(b) Special limitations (if any) -
If it is intended that a copy of this document shall be released overseas refer to RAE Leaflet No.3 to Supplement 6 of MOD Manual 4.

16. Descriptors (Keywords) (Descriptors marked * are selected from TEST)

Tracking filters. Mixture distributions. Bayesian filters. Data fusion. PDAF. Kalman filters.


17.  Abstract 

This  study  concerns  the  problem  of  tracking  a  target  when  the  origin  of  sensor 
measurements  is  uncertain.  The  full  Bayesian  solution  to  this  type  of  problem  gives 
rise  to  Gaussian  mixture  distributions,  which  are  composed  of  an  ever  increasing 
number  of  components.  Two  algorithms  have  been  developed  for  approximating  such 
distributions,  so  allowing  practical  tracking  filters  to  be  derived. 

For  a  standard  tracking  problem,  simulation  has  been  used  to  determine  the 
significant  range  of  problem  parameters  where,  at  the  expense  of  extra  computation, 
the  new  algorithms  give  a  substantial  performance  improvement  over  the  well-known 
Probabilistic  Data  Association  Filter.  The  algorithms  have  also  been  used  to  derive 
Bayesian  filters  for  data  fusion  problems  and  for  tracking  a  target  in  the  presence 
of  intermittent  interfering  measurements. 


RAE Form A143 (revised October 1980)


